PRODUCTION OF EPIDERMAL GROWTH FACTOR IN
METHYLOTROPHIC YEAST CELLS



Field of the Invention

This invention relates to a process of
recombinant DNA technology for producing epidermal growth
factor (EGF) peptides in methylotrophic yeast such as
Pichia pastoris. Methylotrophic yeast transformants
containing in their genome at least one copy of a DNA
sequence operably encoding an EGF peptide under the
regulation of a promoter region of a gene of a
methylotrophic yeast and the S. cerevisiae alpha-mating
factor (AMF) pre-pro sequence are cultured under
conditions allowing the expression of EGF peptides into
the culture medium. The invention further relates to the
methylotrophic yeast transformants, DNA fragments and
expression vectors used for their production and cultures
containing same.

Background of the Invention 

Epidermal growth factor (EGF) is a naturally
occurring, relatively short, single-chain polypeptide,
which was first isolated from the mouse submaxillary
gland. A structurally very similar polypeptide was later
detected and isolated from human urine at low (about 30
ng/ml) concentrations. Both mouse and human epidermal
growth factors (the latter also called urogastrone in
some earlier publications) contain 53 amino acids.
Thirty-seven of these are identical in the amino acid
sequences of mouse epidermal growth factor (mEGF) and
human epidermal growth factor (hEGF), as are the relative
positions of the three disulfide bonds present in the
structure [Gregory, Nature, 325 (1975); Gregory et
al., Hoppe-Seyler's Z. Physiol. Chem., 356, 1765 (1975)].

The amino acid sequence of the form of hEGF containing 53
amino acids (beta-hEGF), as reported in the literature, is
as follows:

    Asn Ser Asp Ser Glu Cys Pro Leu Ser
    His Asp Gly Tyr Cys Leu His Asp Gly
    Val Cys Met Tyr Ile Glu Ala Leu Asp
    Lys Tyr Ala Cys Asn Cys Val Val Gly
    Tyr Ile Gly Glu Arg Cys Gln Tyr Arg
    Asp Leu Lys Trp Trp Glu Leu Arg

The polypeptide also exists as a 52 amino acid
form (gamma-hEGF) that lacks the C-terminal arginine
residue found in beta-hEGF.
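
The relationship between the listed forms can be checked mechanically. The following is a minimal sketch, assuming the sequence printed above is read with its isoleucine and glutamine residues as Ile and Gln; the variable names are illustrative and not part of the specification. It converts the three-letter listing to one-letter code and derives the 52 and 48 amino acid forms by C-terminal truncation.

```python
# Three-letter residues of the 53 amino acid hEGF sequence as listed above.
HEGF_53 = """
Asn Ser Asp Ser Glu Cys Pro Leu Ser
His Asp Gly Tyr Cys Leu His Asp Gly
Val Cys Met Tyr Ile Glu Ala Leu Asp
Lys Tyr Ala Cys Asn Cys Val Val Gly
Tyr Ile Gly Glu Arg Cys Gln Tyr Arg
Asp Leu Lys Trp Trp Glu Leu Arg
""".split()

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(residues):
    """Join a list of three-letter residue names into a one-letter string."""
    return "".join(THREE_TO_ONE[r] for r in residues)


beta_hegf = to_one_letter(HEGF_53)   # 53 amino acid form
gamma_hegf = beta_hegf[:-1]          # lacks the C-terminal arginine
hegf_1_48 = beta_hegf[:-5]           # lacks the five C-terminal residues

print(len(beta_hegf), len(gamma_hegf), len(hegf_1_48))
```

Truncating from the C-terminus is the only operation needed here because, as the text describes, the shorter forms differ from the 53 amino acid form solely by missing C-terminal residues.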

The amino acid and nucleotide sequences of hEGF
are, for example, disclosed in Hollenberg, "Epidermal
Growth Factor-Urogastrone, A Polypeptide Acquiring
Hormonal Status"; eds., Academic Press, Inc., New York
(1979), pp. 69-110; or Urdea et al., Proc. Natl. Acad.
Sci. USA, 80, 7461 (1983).

A 48 amino acid-containing form of hEGF
(lacking the five C-terminal amino acids) is described in
Japanese Patent Application 86146964, published 8
February 1988 under No. 63003791.

The molecule in natural form contains disulfide
linkages between residues 6-20, 14-31 and 33-42, and
arises from an about 1200 amino acid precursor molecule
consisting of eight EGF-like regions [see e.g. Bell et
al., Nucleic Acids Research, 14, 21, 8427 (1986)]. A form
of rat EGF containing 48 amino acids has recently been
disclosed in the Japanese Patent Application 8736498,
published 22 August 1988, under No. 63202387. Both mEGF
and hEGF, as well as their known analogs, exhibit similar
pharmacological activities, although the extent or
spectrum of activity may be different for different
materials. In general, EGF inhibits the secretion of
gastric acid and promotes cell growth; therefore, it is
a candidate for therapeutic use as, for example, an
anti-ulcer agent and in external wound healing.
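
The stated disulfide pairing can be cross-checked against the 53 amino acid sequence given earlier. This is a small sketch under the assumption that the one-letter string below is a faithful transcription of that sequence; the function names are hypothetical. It confirms that every listed bond partner is a cysteine and that each cysteine is used exactly once.

```python
# One-letter transcription of the 53 amino acid hEGF sequence (assumed faithful).
HEGF = "NSDSECPLSHDGYCLHDGVCMYIEALDKYACNCVVGYIGERCQYRDLKWWELR"

# Disulfide linkages as stated in the text (1-based residue numbers).
DISULFIDES = [(6, 20), (14, 31), (33, 42)]


def cysteine_positions(seq):
    """Return the 1-based positions of all cysteine residues."""
    return [i for i, aa in enumerate(seq, start=1) if aa == "C"]


def bonds_are_consistent(seq, bonds):
    """True if the bonds pair up every cysteine in seq exactly once."""
    paired = [pos for bond in bonds for pos in bond]
    return (len(paired) == len(set(paired))
            and set(paired) == set(cysteine_positions(seq)))


print(cysteine_positions(HEGF))
print(bonds_are_consistent(HEGF, DISULFIDES))
```

Because the last cysteine sits at position 42, the same check also passes for the 48 amino acid form (`HEGF[:48]`), which is consistent with the truncated form retaining all three linkages.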

Since isolation from natural sources is
technically difficult, expensive, and time consuming,
recent efforts have centered on the development of
efficient recombinant methods for the production of EGF.

Of the hosts widely used for the production of
heterologous proteins, E. coli and Saccharomyces
cerevisiae (baker's yeast) are probably the best understood.
However, E. coli tends to produce EGF in its reduced form,
which is not stable in the presence of endogenous
bacterial proteases. Attempts to overcome this problem,
e.g., by employing a suitable leader sequence in order to
produce an insoluble fusion protein which can be readily
recovered from the cell paste, resulted in other
inconveniences, especially during purification of the
product.

Yeasts can offer clear advantages over bacteria
in the production of heterologous proteins, including
their ability to secrete heterologous proteins into the
culture medium. Secretion of proteins from cells is
generally superior to production of proteins in the
cytoplasm. Secreted products are obtained in a higher
degree of initial purity, and their further purification
is easier in the absence of cellular debris. In
the case of sulfhydryl-rich proteins there is another
compelling reason for the development of hosts capable of
secreting them into the culture medium: their correct
tertiary structure is produced and maintained via
disulfide bonds. The secretory pathway of the cell and
the extracellular medium are oxidizing environments which
can support disulfide bond formation [Smith et al.,
Science, 229, 1219 (1985)]. In contrast, the cytoplasm
is a reducing environment in which disulfide bonds cannot
form. Upon cell breakage, too rapid formation of
disulfide linkages can result in random disulfide bond
formation. Consequently, production of sulfhydryl-rich
proteins, such as EGF, containing appropriately formed
disulfide bonds can be best achieved by transit through
the secretory pathway.

Secretion of authentic biologically active
human epidermal growth factor from S. cerevisiae is
disclosed in European Patent Application Nos. 84104445.6
and 84303783.9, published October 31, 1984 (No. 0 123
289) and December 19, 1984 (No. 0 128 733), respectively.
The cited patent applications contain no details as to
the level of secretion or the purity of hEGF obtained.
In an article published in Proc. Natl. Acad. Sci. USA,
81, 4642 (1984), Brake, inventor of European Patent
Application No. 84104445.6, and his co-workers give more
details of their laboratory-scale experiments. hEGF is
produced in S. cerevisiae by means of an expression
cassette containing a DNA sequence encoding mature hEGF
joined to sequences encoding the leader region ("pre-pro"
segment) of the precursor of the yeast mating pheromone
alpha-factor. In what appears to be the best experiment,
hEGF was secreted into the shake flask culture medium at
a concentration of about 4000 ng/ml. In view of the
problems usually encountered with up-scaling the
production of heterologous proteins in autonomous
plasmid-based yeast systems, such as S. cerevisiae, there
is no indication that hEGF production in S. cerevisiae
could be at levels higher than those of that experimental
system.

According to the prior art methods, hEGF is
produced and secreted from yeast in mature form, usually
containing 52 amino acids.

To overcome the major problems associated with
S. cerevisiae, e.g. loss of selection for plasmid
maintenance and problems concerning plasmid distribution,
copy number and stability in fermentors operated at high
cell density, a yeast expression system based on the
methylotrophic yeast Pichia pastoris has been developed.
A key feature making this system unique lies with the
promoter employed to drive heterologous gene expression.
This promoter, which is derived from the methanol-regulated
alcohol oxidase I (AOX1) gene of P. pastoris,
is highly expressed and tightly regulated (see e.g. the
European Patent Application No. 85113737.2, published
June 4, 1986, under No. 0 183 071). Another key feature
of the P. pastoris expression system is the stable
integration of expression cassettes into the P. pastoris
genome, thus significantly decreasing the chance of
vector loss.

Although P. pastoris has been used successfully
for the production of various heterologous proteins,
e.g., hepatitis B surface antigen [Cregg et al.,
Bio/Technology, 479 (1987)], lysozyme and invertase
[Digan et al., Developments in Industrial Microbiology
21, 59 (1988); Tschopp et al., Bio/Technology, 1305
(1987)], endeavors to produce other heterologous gene
products in Pichia, especially by secretion, have given
mixed results. At our present level of understanding of
the P. pastoris expression system, it is unpredictable
whether a given gene can be expressed to an appreciable
level in this yeast or whether Pichia will tolerate the
presence of the recombinant gene product in its cells.
Further, it is especially difficult to foresee if a
particular protein will be secreted by P. pastoris, and
if it is, at what efficiency. Even for S. cerevisiae,
which has been considerably more extensively studied than
P. pastoris, the mechanism of protein secretion is not
well defined and understood.

Summary of the Invention 

The present invention provides an expression
system suitable for the production of EGF. In addition,
the present invention provides a powerful method for the
production of secreted EGF peptides in methylotrophic
yeast such as Pichia pastoris, which method can be easily
scaled up from shake-flask cultures to large fermentors
with no loss in productivity and without making major
changes in the fermentation conditions. The presently
preferred yeast species for use in the practice of the
present invention is Pichia pastoris, a known industrial
yeast strain that is capable of utilizing methanol as the
sole carbon and energy source (methylotroph). We have
surprisingly found that EGF peptides can be produced in
and secreted from methylotrophic yeast such as P.
pastoris very efficiently, by transforming a
methylotrophic yeast with, and preferably integrating
into the yeast genome, at least one copy of a first DNA
sequence operably encoding an EGF peptide, wherein said
first DNA sequence is operably associated with a second
DNA sequence encoding the S. cerevisiae alpha-mating
factor (AMF) pre-pro sequence (including the proteolytic
processing site: lys-arg), and wherein both of said DNA
sequences are under the regulation of a methanol
responsive promoter region of a gene of a methylotrophic
yeast. Methylotrophic yeast cells such as P. pastoris
cells containing in their genome at least one copy of
these DNA sequences efficiently produce biologically
active EGF peptides as a medium secreted product.

Accordingly, this invention relates to a
methylotrophic yeast cell such as a P. pastoris cell
containing in its genome at least one copy of a DNA
sequence operably encoding an EGF peptide, operably
associated with a DNA sequence encoding the S. cerevisiae
AMF pre-pro sequence (including the proteolytic
processing site: lys-arg), both under the regulation of a
promoter region of a gene of a methylotrophic yeast.

According to another aspect, this invention
relates to a DNA fragment containing at least one copy of
an expression cassette comprising, in the reading frame
direction of transcription, the following DNA sequences:
    (i) a promoter region of a methanol
        responsive gene of a methylotrophic yeast,
    (ii) a DNA sequence encoding a polypeptide
        consisting of:
        (a) the S. cerevisiae AMF pre-pro
            sequence, including the proteolytic
            processing site: lys-arg, and
        (b) a DNA sequence encoding an EGF
            peptide; and
    (iii) a transcription terminator functional in a
        methylotrophic yeast,
wherein said DNA sequences are operationally associated
with one another for transcription of the sequences
encoding said polypeptide.
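
The ordering requirement for the cassette elements can be illustrated with a small data-structure sketch. The `Element` class, the role labels and the example element names are assumptions introduced only for illustration; the specification itself defines the cassette in prose.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """One functional segment of an expression cassette (illustrative)."""
    name: str
    role: str  # "promoter", "leader", "coding" or "terminator"


# Order of roles in the direction of transcription, per items (i) to (iii).
REQUIRED_ORDER = ("promoter", "leader", "coding", "terminator")


def is_valid_cassette(elements):
    """True if the elements appear exactly in the required 5'-to-3' order."""
    return tuple(e.role for e in elements) == REQUIRED_ORDER


# Hypothetical example cassette; the element names are placeholders.
cassette = [
    Element("methanol-responsive promoter", "promoter"),
    Element("AMF pre-pro incl. lys-arg site", "leader"),
    Element("EGF peptide coding sequence", "coding"),
    Element("transcription terminator", "terminator"),
]

print(is_valid_cassette(cassette))
```

The check deliberately looks only at element order; reading-frame continuity between the leader and the EGF coding sequence is a separate requirement that this sketch does not model.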

The DNA fragment according to the invention can
be transformed into the methylotrophic yeast cells such
as P. pastoris cells as a linear fragment flanked by DNA
sequences having sufficient homology with a target gene
to effect integration of said DNA fragment therein. In
this case integration takes place by replacement at the
site of the target gene. Alternatively, the DNA fragment
can be part of a circular plasmid, which may be
linearized to facilitate integration, and will integrate
by addition at a site of homology between the host and
the plasmid sequence.

The invention further concerns an expression
vector containing at least one copy of an expression
cassette described hereinabove.

According to a still further embodiment, the
invention relates to a process for producing EGF peptides
by growing methylotrophic yeast transformants containing
in their genome at least one copy of a DNA sequence
operably encoding an EGF peptide, operably associated
with DNA encoding the S. cerevisiae AMF pre-pro sequence,
both under the regulation of a promoter region of a
gene of a methylotrophic yeast, under
conditions allowing the expression of said DNA sequence
in said transformants and secreting mature EGF peptides
into the culture medium. Cultures of viable
methylotrophic yeast cells capable of producing EGF
peptides are also within the scope of the invention.

The polypeptide product is secreted to the
culture medium at surprisingly high concentrations; the
level of EGF peptide secretion is about two orders of
magnitude higher than the best results published in the
literature. In addition to the unique properties of the
invention expression system, these excellent
results are also due to the fact that the S. cerevisiae
alpha-mating factor pre-pro sequence functions
unexpectedly well to direct secretion of EGF peptides in
methylotrophic yeast such as P. pastoris.
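
To give a feel for the scale implied by "two orders of magnitude", the arithmetic can be sketched as follows. This assumes that the benchmark is the shake-flask figure of about 4000 ng/ml cited in the Background for S. cerevisiae; the passage above does not state an absolute titre, so the output is an order-of-magnitude illustration only and not a reported measurement.

```python
# Prior-art shake-flask concentration cited in the Background (ng/ml).
PRIOR_ART_NG_PER_ML = 4000.0


def ng_per_ml_to_mg_per_l(ng_per_ml):
    """Convert ng/ml to mg/L (1 ng/ml equals 1 microgram/L, i.e. 0.001 mg/L)."""
    return ng_per_ml / 1000.0


def scale_by_orders(value, orders):
    """Multiply a value by 10 raised to the given number of orders of magnitude."""
    return value * 10 ** orders


# Illustrative estimate: two orders of magnitude above the prior-art figure.
estimate_ng_per_ml = scale_by_orders(PRIOR_ART_NG_PER_ML, 2)

print(ng_per_ml_to_mg_per_l(PRIOR_ART_NG_PER_ML), "mg/L (prior art)")
print(ng_per_ml_to_mg_per_l(estimate_ng_per_ml), "mg/L (illustrative estimate)")
```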

Another surprising discovery is that the full
length, 1-52 form of hEGF secreted by P. pastoris cells
is not stable in the broth; it gets degraded to a shorter
1-48 amino acid containing, stable form. The shorter
hEGF form has essentially the same biological activity as
the full length hEGF.

The present invention is directed to the above
aspects and all associated methods and means for
accomplishing such. For example, the invention includes
the technology requisite to suitable growth of the
methylotrophic yeast host cells, fermentation, and
isolation and purification of the EGF gene product.

P. pastoris is described herein as a model
system for the use of methylotrophic yeast hosts. Other
useful methylotrophic yeasts can be taken from four
genera, namely Candida, Hansenula, Pichia and Torulopsis.
Equivalent species from them may be used as hosts herein,
primarily based upon their demonstrated ability to
grow on methanol as a single carbon nutriment source. See, for
example, Gleeson et al., Yeast 4, 1 (1988).
Brief Description of the Drawings

Figure 1 shows the restriction map and insert
sequence of the hEGF gene employed herein.

Figure 2 shows the restriction map and insert
sequence of the AMF pre-pro fragment (including the
proteolytic processing site: lys-arg) employed herein.

Figure 3 is a restriction map of plasmid pAO208.

Figure 4 is a restriction map of plasmid pAO817.

Figure 5 is a restriction map of plasmid pEGF819.

Detailed Description of the Invention

The term "epidermal growth factor" or "EGF
peptide" or simply "EGF", as used throughout the
specification and in the claims, refers to a polypeptide
product which exhibits similar, in-kind, biological
activities to natural human epidermal growth factor
(hEGF), as measured in recognized bioassays, and has
substantially the same amino acid sequence as hEGF,
including the 53, 52 and 48 amino acid forms. It will be
understood that polypeptides deficient in one or more
amino acids in the amino acid sequence reported in the
literature for naturally occurring hEGF, or polypeptides
containing additional amino acids or polypeptides in
which one or more amino acids in the amino acid sequence
of natural hEGF are replaced by other amino acids are
within the scope of the invention, provided that they
exhibit the functional activity of hEGF, e.g., inhibition
of the secretion of gastric acid and promotion of cell
growth. The invention is intended to embrace all the
allelic variations of hEGF. Moreover, as noted supra,
derivatives obtained by simple modification of the amino
acid sequence of the naturally occurring product, e.g., by
way of site-directed mutagenesis or other standard
procedures, are included within the scope of the present
invention. EGF forms produced by proteolysis of host
cells that exhibit similar biological activities to
mature, naturally occurring hEGF are also encompassed by
the present invention.

The amino acids, which occur in the various 
amino acid sequences referred to in the specification 
have their usual, three- and one-letter abbreviations, 
routinely used in the art, i.e.: 

    Amino Acid           Abbreviation

    L-Alanine            Ala    A
    L-Arginine           Arg    R
    L-Asparagine         Asn    N
    L-Aspartic acid      Asp    D
    L-Cysteine           Cys    C
    L-Glutamine          Gln    Q
    L-Glutamic acid      Glu    E
    L-Glycine            Gly    G
    L-Histidine          His    H
    L-Isoleucine         Ile    I
    L-Leucine            Leu    L
    L-Lysine             Lys    K
    L-Methionine         Met    M
    L-Phenylalanine      Phe    F
    L-Proline            Pro    P
    L-Serine             Ser    S
    L-Threonine          Thr    T
    L-Tryptophan         Trp    W
    L-Tyrosine           Tyr    Y
    L-Valine             Val    V



According to the invention, EGF peptides are
produced by methylotrophic yeast cells containing in
their genome at least one copy of a DNA sequence operably
encoding EGF peptides operably associated with DNA
encoding the S. cerevisiae alpha-mating factor (AMF) pre-pro
sequence (including the proteolytic processing site:
lys-arg), both under the regulation of a promoter region of a
methanol responsive gene of a methylotrophic yeast.

The term "a DNA sequence operably encoding EGF
peptides" as used herein includes DNA sequences encoding
the 53, 52 and 48 amino acid forms of hEGF or any other
"EGF peptide" as defined hereinabove. DNA sequences
encoding EGF, e.g. hEGF, are known in the art. They may
be obtained by chemical synthesis or by reverse transcription of
a messenger RNA (mRNA) corresponding to EGF to a
complementary DNA (cDNA) and converting the latter into a
double stranded cDNA. The mRNA can be isolated, for
example, from adult mouse kidney [Rall et al., Nature,
313, 228 (1985)] or from adult human kidney [Bell et al.,
Nucleic Acids Research, 14, 21, 8427 (1986)]. Chemical
synthesis of a gene for human EGF is, for example,
disclosed by Urdea et al., supra. The requisite DNA
sequence can also be removed, for example, by restriction
enzyme digest of known vectors harboring the EGF gene.
Examples of such vectors and the means for their
preparation can be taken from the following publications:
Brake et al., supra - e.g. the pBR322-based vector
pYαEGF-21; Urdea et al., supra - plasmid pYEGF-2, etc. The
structure of a preferred hEGF gene used in accordance
with the present invention is further elucidated in the
examples.

The presently preferred promoter region
employed to drive the EGF gene expression is derived from
a methanol-regulated alcohol oxidase gene of P. pastoris.
P. pastoris is known to contain two functional alcohol
oxidase genes: alcohol oxidase I (AOX1) and alcohol
oxidase II (AOX2) genes. The coding portions of the two
AOX genes are closely homologous at both the DNA and the
predicted amino acid sequence levels and share common
restriction sites. The proteins expressed from the two
genes have similar enzymatic properties, but the promoter
of the AOX1 gene is more efficient and more highly expressed;
therefore, its use is preferred for EGF expression. The
AOX1 gene, including its promoter, has been isolated and
thoroughly characterized [Ellis et al., Mol. Cell. Biol.,
1111 (1985)].

The expression cassette used for transforming
methylotrophic yeast cells contains, in addition to a
methanol responsive promoter of a methylotrophic yeast
gene and the EGF encoding DNA sequence (EGF gene), a DNA
sequence encoding the in-reading frame S. cerevisiae AMF
pre-pro sequence, including a DNA sequence encoding the
processing site: lys-arg (also referred to as the
lys-arg encoding sequence), and a transcription terminator
functional in a methylotrophic yeast.

The S. cerevisiae alpha-mating factor is a
13-residue peptide, secreted by cells of the "alpha" mating
type, that acts on cells of the opposite "a" mating type
to promote efficient conjugation between the two cell
types and thereby formation of "a-alpha" diploid cells
[Thorner et al., The Molecular Biology of the Yeast
Saccharomyces, Cold Spring Harbor Laboratory, Cold Spring
Harbor, NY, 143 (1981)]. The AMF pre-pro sequence is a
leader sequence contained in the AMF precursor molecule,
and includes the lys-arg encoding sequence which is
necessary for proteolytic processing and secretion (see
e.g. Brake et al., supra). The AMF pre-pro sequence,
including the lys-arg encoding sequence, is a 255 bp
fragment which is shown in Figure 2.
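
The fragment length lends itself to a quick reading-frame check. The sketch below assumes, purely for illustration, that all 255 bp are coding sequence and that the fused open reading frame ends with a single stop codon; the actual fragment in Figure 2 may contain additional non-coding or linker bases, and the function names are hypothetical.

```python
CODON_BP = 3          # nucleotides per codon
AMF_PREPRO_BP = 255   # fragment length stated in the text


def codons(length_bp):
    """Number of whole codons in a coding segment; rejects out-of-frame lengths."""
    if length_bp % CODON_BP != 0:
        raise ValueError("length is not a multiple of three")
    return length_bp // CODON_BP


def fusion_orf_bp(leader_bp, mature_residues, stop_codons=1):
    """Length in bp of a leader fused in frame to a mature peptide plus stop."""
    return leader_bp + CODON_BP * (mature_residues + stop_codons)


print(codons(AMF_PREPRO_BP))             # codons in the leader, if fully coding
print(fusion_orf_bp(AMF_PREPRO_BP, 53))  # leader + 53-residue hEGF + stop
```

Since 255 is a multiple of three, a coding sequence joined directly after the lys-arg codons stays in the same reading frame as the leader, which is the property the cassette design relies on.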

The transcription terminator functional in a
methylotrophic yeast used in accordance with the present
invention has a subsegment which encodes a
polyadenylation signal and polyadenylation site in the
transcript and/or a subsegment which provides a
transcription termination signal for transcription from
the promoter used in the expression cassette according to
the invention (the term "expression cassette" as used
herein and throughout the specification and claims refers
to a DNA sequence which includes sequences functional for
expression and the secretion processes). The entire
transcription terminator is taken from a protein-encoding
gene, which may be the same or different from the gene
which is the source of the promoter used according to the
invention.

For the practice of the present invention it is
preferred that multiple copies of the above-described
expression cassettes be contained on one DNA fragment,
preferably in a head-to-tail orientation. It is
particularly preferred that four or more copies of the
above-described expression cassette be contained on one
DNA fragment.

The DNA fragments according to the invention
optionally further comprise a selectable marker gene.
For this purpose, any selectable marker gene functional
in methylotrophic yeast such as P. pastoris may be
employed, i.e., any gene which confers a phenotype upon
methylotrophic yeast cells such as P. pastoris cells
thereby allowing them to be identified and selectively
grown from among a vast majority of untransformed cells.
Suitable selectable marker genes include, for example,
selectable marker systems composed of an auxotrophic
mutant P. pastoris host strain and a wild type
biosynthetic gene which complements the host's defect.
For transformation of his4- P. pastoris strains, for
example, the S. cerevisiae or P. pastoris HIS4 gene, or
for transformation of arg4- mutants the S. cerevisiae ARG4
gene or the P. pastoris ARG4 gene, may be employed.

If the yeast host is transformed with a linear
DNA fragment containing the EGF gene under the regulation
of a promoter region of a P. pastoris gene and AMF
sequences necessary for processing and secretion, the
expression cassette is integrated into the host genome by
any of the gene replacement techniques known in the art,
such as by one-step gene replacement [see e.g.,
Rothstein, Methods Enzymol. 101, 202 (1983); Cregg et
al., Bio/Technology 5, 479 (1987)] or by two-step gene
replacement methods [see e.g., Scherer and Davis, Proc.
Natl. Acad. Sci. USA, 76, 4951 (1979)]. The linear DNA
fragment is directed to the desired locus, i.e., to the
target gene to be disrupted, by means of flanking DNA
sequences having sufficient homology with the target gene
to effect integration of the DNA fragment therein.
One-step gene disruptions are usually successful if the DNA
to be introduced has as little as 0.2 kb homology with
the fragment locus of the target gene; it is, however,
preferable to maximize the degree of homology for
efficiency.
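
The homology guideline above can be expressed as a simple design check. This is a sketch only: it assumes that the 0.2 kb figure is applied to each flanking sequence separately, which the text does not state explicitly, and the function names and example lengths are hypothetical.

```python
# Lower bound on homology mentioned in the text for one-step gene disruption.
MIN_HOMOLOGY_KB = 0.2


def flank_is_sufficient(flank_kb, minimum_kb=MIN_HOMOLOGY_KB):
    """True if a flanking sequence meets the minimum homology length."""
    return flank_kb >= minimum_kb


def fragment_is_targetable(five_prime_kb, three_prime_kb):
    """True if both flanks of a linear fragment meet the minimum (assumed per flank)."""
    return flank_is_sufficient(five_prime_kb) and flank_is_sufficient(three_prime_kb)


# Hypothetical flank lengths in kb, for illustration only.
print(fragment_is_targetable(0.9, 0.3))
print(fragment_is_targetable(0.9, 0.1))
```

Meeting the threshold only indicates that integration is plausible; as the text notes, longer homology is preferable for efficiency.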

If the DNA fragment according to the invention
is contained within or is an expression vector, e.g., a
circular plasmid, one or more copies of the plasmid can
be integrated at the same or different loci, by addition
to the genome instead of by gene disruption.
Linearization of the plasmid by means of a suitable
restriction endonuclease facilitates integration.

The term "expression vector" includes vectors
capable of expressing DNA sequences contained therein,
where such sequences are in operational association with
other sequences capable of effecting their expression,
i.e., promoter sequences. In general, expression vectors
used in recombinant DNA technology are often in
the form of "plasmids", i.e., circular, double-stranded
DNA loops which, in their vector form, are not bound to
the chromosome. In the present specification the terms
"vector" and "plasmid" are used interchangeably.
However, the invention is intended to include other forms
of expression vectors as well, which function
equivalently.

In the DNA fragment according to the invention
the segments of the expression cassette are
"operationally associated" with one another. The DNA
sequence encoding EGF peptides is positioned and oriented
functionally with respect to the promoter, the DNA
sequence encoding the S. cerevisiae AMF pre-pro sequence
(including the DNA sequence encoding the AMF processing
site: lys-arg), and the transcription terminator. Thus,
the polypeptide encoding segment is transcribed, under
regulation of the promoter region, into a transcript
capable of providing, upon translation, the desired
polypeptide. Because of the presence of the AMF pre-pro
sequence, the expressed EGF product is found as a
secreted entity in the culture medium. Appropriate
reading frame positioning and orientation of the various
segments of the expression cassette are within the
knowledge of persons of ordinary skill in the art;
further details are given in the Examples.

The DNA fragment provided by the present 
invention may include sequences allowing for its 
replication and selection in bacteria, especially E. 
coli. In this way, large quantities of the DNA fragment 
can be produced by replication in bacteria. 

Methods of transforming methylotrophic yeast
such as Pichia pastoris as well as methods applicable for
culturing methylotrophic yeast such as P. pastoris cells
containing in their genome a gene for a heterologous
protein are known generally in the art.

According to the invention, the expression
cassettes are transformed into the cells of a
methylotrophic yeast either by the spheroplast technique,
described by Cregg et al., Mol. Cell. Biol., 3375
(1985), or by the whole-cell lithium chloride yeast
transformation system [Ito et al., Agric. Biol. Chem.,
341 (1984)], with modification necessary for adaptation
to P. pastoris [see EP 312,934]. Although the whole-cell
lithium chloride method is more convenient in that
it does not require the generation and maintenance of
spheroplasts, for the purpose of the present invention
the spheroplast method is preferred, primarily since it
yields a greater number of transformants.

Positive transformants are characterized by
Southern blot analysis [Maniatis et al., Molecular
Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, New York, USA
(1982)] for the site of DNA integration, Northern blots
[Maniatis, op. cit.; R.S. Zitomer and B.D. Hall, J. Biol.
Chem., 251, 6320 (1976)] for methanol-responsive EGF gene
expression, and product analysis for the presence of
secreted EGF peptides in the growth media.

Transformed strains, which are of the desired
phenotype and genotype, are grown in fermentors. For the
large-scale production of recombinant DNA-based products
in methylotrophic yeast such as P. pastoris, a
three-stage, high cell-density, batch fermentation system is
normally employed. In the first, or growth stage,
expression hosts are cultured in defined minimal medium
with excess glycerol as carbon source. When grown on
this carbon source heterologous gene expression is
completely repressed, which allows the generation of cell
mass in the absence of heterologous protein expression.
Next, a short period of glycerol limitation growth is
allowed. Subsequent to the glycerol limited growth,
methanol is added, initiating the expression of the
desired heterologous protein. This third stage is the
so-called production stage.
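
The three-stage scheme can be summarized as a small state model. The stage names, the feed labels and the helper functions below are illustrative assumptions; the text specifies only the order of the stages, the carbon source in each, and that expression is repressed on excess glycerol and initiated by methanol.

```python
from enum import Enum


class Stage(Enum):
    GROWTH = 1             # excess glycerol, expression repressed
    GLYCEROL_LIMITED = 2   # short period of glycerol-limited growth
    PRODUCTION = 3         # methanol added, expression initiated


# Carbon source supplied in each stage, as described in the text.
CARBON_FEED = {
    Stage.GROWTH: "excess glycerol",
    Stage.GLYCEROL_LIMITED: "limiting glycerol",
    Stage.PRODUCTION: "methanol",
}


def next_stage(stage):
    """Return the stage that follows, or None after the production stage."""
    order = list(Stage)
    idx = order.index(stage)
    return order[idx + 1] if idx + 1 < len(order) else None


def expression_initiated(stage):
    """Heterologous expression is initiated only once methanol is added."""
    return stage is Stage.PRODUCTION


stage = Stage.GROWTH
while stage is not None:
    print(stage.name, "|", CARBON_FEED[stage], "|", expression_initiated(stage))
    stage = next_stage(stage)
```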

The term "culture" means a propagation of cells
in a medium conducive to their growth, and all
subcultures thereof. The term "subculture" refers to a
culture of cells grown from cells of another culture
(source culture), or any subculture of the source
culture, regardless of the number of subculturings which
have been performed between the subculture of interest
and the source culture.

According to a preferred embodiment of the
invention, the heterologous protein expression system
used for EGF production utilizes the promoter derived
from the methanol-regulated AOX1 gene of P. pastoris,
which is very efficiently expressed and tightly
regulated. This gene can be the source of the
transcription terminator as well. The presently
preferred expression cassette comprises, operationally
associated with one another, a P. pastoris AOX1 promoter,
DNA encoding the S. cerevisiae AMF pre-pro sequence
(including the DNA sequence encoding the AMF processing
site: lys-arg), a DNA sequence encoding mature hEGF, and
a transcription terminator derived from the P. pastoris
AOX1 gene. Preferably, two or more of such expression
cassettes are contained on one DNA fragment, in
head-to-tail orientation, to yield multiple expression cassettes
on a single contiguous DNA fragment.

The presently preferred host cells to be
transformed with multiple expression cassettes are P.
pastoris cells having at least one mutation that can be
complemented with a marker gene present on a transforming
DNA fragment. Preferably his4- (GS115) or arg4- (GS190)
auxotrophic mutant P. pastoris strains are employed.

The fragment containing multiple expression
cassettes is inserted into a plasmid containing a marker
gene complementing the host's defect. pBR322-based
plasmids, e.g., pAO815, are preferred. Insertion of
multiple copies of the hEGF expression/secretion cassette
into parent plasmid pAO815 produces plasmids pAO817 and
pEGF819.

To develop Mut- expression strains of P.
pastoris, the transforming DNA comprising the expression
cassette(s) is (are) preferably integrated into the host
genome by a one-step gene replacement technique. The
expression vector is digested with an appropriate enzyme
to yield a linear DNA fragment with ends homologous to
the AOX1 locus by means of the flanking homologous
sequences. This approach avoids the problems encountered
with S. cerevisiae, wherein expression cassettes must be
present on multicopy plasmids to achieve high level of
expression. As a result of gene replacement, Mut- strains
are obtained. Mut refers to the methanol-utilization
phenotype. In Mut- strains, the AOX1 gene is replaced
with the expression cassette(s), thus decreasing the
strain's ability to utilize methanol. A slow growth rate
on methanol is maintained by expression of the AOX2 gene
product. The transformants in which the expression
cassette has integrated into the AOX1 locus by
site-directed recombination can be identified by first
screening for the presence of the complementing gene.
This is preferably accomplished by growing the cells in a
medium lacking the complementing gene product and
identifying those cells which are able to grow by virtue
of expression of the complementing gene. Next, the
selected cells are screened for their Mut phenotype by
growing them in the presence of methanol and monitoring
their growth rate.

To develop Mut+ EGF-expressing strains, the
fragment comprising one or more expression cassette(s)
preferably is integrated into the host genome by
transformation of the host with a circular plasmid
comprising the expression cassette(s). The integration
is by addition at a locus or loci having homology with
one or more sequences present on the transformation
vector.

Positive transformants are characterized by
Southern analysis for the site of DNA integration, by
Northern analysis for methanol-responsive EGF gene
expression, and by product analysis for the presence of
secreted hEGF peptides in the growth media.
Methylotrophic yeast strains which have integrated one or
multiple copies of the expression cassettes at a desired
site can be identified by Southern blot analysis.
Strains which demonstrate enhanced secretion of hEGF may
be identified by Northern or product analysis; however,
this characteristic is not always easy to detect in
shake-flask experiments.

Methylotrophic yeast transformants which are
identified to have the desired genotype and phenotype are
grown in fermentors. Typically a three-step production
process is used. Initially, cells are grown on a
repressing carbon source, preferably excess glycerol. In
this stage the cell mass is generated in the absence of
expression. Next, a short period of glycerol-limited
growth is allowed. After exhaustion of glycerol,
methanol alone (methanol excess fed-batch mode) or
limiting glycerol and methanol (mixed-feed fed-batch
mode) is added to the fermentor, resulting in the
expression of the hEGF gene driven by a
methanol-responsive promoter. The level of hEGF secreted
into the media can be determined by Western blot analysis
of the media in parallel with an EGF standard, using
anti-EGF antisera, or by HPLC after suitable pretreatment
of the medium.

The invention is further illustrated by the
following non-limiting examples.

Examples

Example 1

The expression vector constructions
disclosed in the present application were performed using
standard procedures, as described, for example, in
Maniatis et al., supra, and Davis et al., Basic Methods
in Molecular Biology, Elsevier Science Publishing, Inc.,
New York (1986).

The hEGF gene was obtained from a pBR322-based
plasmid on an NcoI-HindIII fragment. The hEGF encoding
fragment employed is shown in Figure 1.

The AMF pre-pro encoding sequence (including
the proteolytic processing site: lys-arg) employed in the
present study was a 255 nucleotide fragment shown in
Figure 2.

This 255 nucleotide fragment was derived from
plasmid pAO208, shown in Figure 3. The construction of
plasmid pAO208 is described in detail below.

a. Construction of plasmid pAO208
The AOX1 transcription terminator was isolated
from 20 ng of pPG2.0 [pPG2.0 = BamHI-HindIII fragment of
pPG4.0 (NRRL 15868) + pBR322] by StuI digestion followed
by the addition of 0.2 µg SalI linkers (GGTCGACC). The
plasmid was subsequently digested with HindIII and the
350 bp fragment isolated from a 10% acrylamide gel and
subcloned into pUC18 (Boehringer Mannheim) digested with
HindIII and SalI. The ligation mix was transformed into
JM103 cells (that are widely available) and ampR colonies
were selected. The correct construction was verified by
HindIII and SalI digestion, which yielded a 350 bp
fragment, and was called pAO201.

5 µg of pAO201 was digested with HindIII,
filled in using Klenow polymerase, and 0.1 µg of BglII
linkers (GAGATCTC) were added. After digestion of the
excess BglII linkers, the plasmid was reclosed and
transformed into MC1061 cells. AmpR cells were selected,
DNA was prepared, and the correct plasmid was verified by
BglII, SalI double digests, yielding a 350 bp fragment,
and by a HindIII digest to show loss of the HindIII site.
This plasmid was called pAO202.

The alpha factor-GRF fusion was isolated as a
360 bp BamHI-PstI partial digest from pYSV201. Plasmid
pYSV201 is the EcoRI-BamHI fragment of GRF-E-3 inserted
into M13mp18 (New England Biolabs). Plasmid GRF-E-3 is
described in EP 206,783. 20 ng of pYSV201 plasmid was
digested with BamHI and partially digested with PstI. To
this partial digest was added the following
oligonucleotides:

5' AATTCGATGAGATTTCCTTCAATTTTTACTGCA 3'
3'     GCTACTCTAAAGGAAGTTAAAAATG     5'.

Only the antisense strand of the oligonucleotide was
kinase labelled so that the oligonucleotides did not
polymerize at the 5' end. After acrylamide gel
electrophoresis (10%), the fragment of 385 bp was
isolated by electroelution. This EcoRI-BamHI fragment
of 385 bp was cloned into pAO202 which had been cut with
EcoRI and BamHI. Routinely, 5 ng of vector cut with the
appropriate enzymes and treated with calf intestine
alkaline phosphatase was ligated with 50 ng of the
insert fragment. MC1061 cells were transformed, ampR
cells were selected, and DNA was prepared. In this case,
the resulting plasmid, pAO203, was cut with EcoRI and
BglII to yield a fragment of greater than 700 bp. The
α-factor-GRF fragment codes for the (1-40) leu27 version
of GRF and contains the processing sites
lys-arg-glu-ala-glu-ala.
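
The pairing of the two adaptor oligonucleotides given above can be checked with a short sketch. It assumes the lower strand reads GCTACTCTAAAGGAAGTTAAAAATG in the 3' to 5' direction; the function and variable names are illustrative only and do not come from the specification.

```python
# Adaptor oligonucleotides; TOP is written 5'->3', BOTTOM_3_TO_5 is written 3'->5'.
TOP = "AATTCGATGAGATTTCCTTCAATTTTTACTGCA"
BOTTOM_3_TO_5 = "GCTACTCTAAAGGAAGTTAAAAATG"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def duplex_region(top: str, bottom_3to5: str) -> tuple[str, str, str]:
    """Split the top strand into (5' overhang, paired region, 3' overhang).

    Because the bottom strand is given 3'->5', its base-by-base complement
    must appear verbatim inside the top strand if the two strands anneal.
    """
    paired = bottom_3to5.translate(COMPLEMENT)
    start = top.find(paired)
    if start < 0:
        raise ValueError("strands are not complementary")
    return top[:start], paired, top[start + len(paired):]


five_prime, paired, three_prime = duplex_region(TOP, BOTTOM_3_TO_5)
print(five_prime, len(paired), three_prime)
```

Under these assumptions the annealed adaptor has a 25 bp duplex flanked by an AATT 5' overhang (EcoRI-compatible) and a TGCA 3' overhang (PstI-compatible), which is consistent with the 360 bp BamHI-PstI fragment becoming a 385 bp EcoRI-BamHI fragment.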

The AOX1 promoter was isolated as a 1900 bp
EcoRI fragment from 20 µg of pAOP3 and subcloned into
EcoRI-digested pAO203. The development of pAOP3 is
disclosed in EP 226,846 and described hereinbelow.
MC1061 cells were transformed with the ligation reaction,
ampR colonies were selected, and DNA was prepared. The
correct orientation contains a ≈376 bp HindIII fragment,
whereas the wrong orientation has a ≈675 bp fragment.
One such transformant was isolated and was called pAO204.

The parent vector for pAO208 is the HIS4, PARS2
plasmid pYJ32 (NRRL B-15891), which was modified to change
the EcoRV site in the tetR gene to a BglII site, by
digesting pYJ32 with EcoRV and adding BglII linkers to
create pYJ32(+BglII). This plasmid was digested with
BglII and the 1.75 Kb BglII fragment from pAO204
containing the AOX1 promoter-α mating factor-GRF-AOX1 3'
expression cassette was inserted. The resulting vector
was called pAO208. An EcoRI digest of pAO208 yielded an
850 bp fragment + vector, while vector having the other
orientation yielded a 1.1 Kb fragment + vector.

b. Construction of plasmid pAOP3
1. Plasmid pPG2.5 [a pBR322 based plasmid
containing the approximately 2.5 Kbp EcoRI-SalI fragment
from plasmid pPG4.0, which plasmid contains the primary
alcohol oxidase gene (AOX1) and regulatory regions and
which is available in an E. coli host from the Northern
Regional Research Center of the United States Department
of Agriculture in Peoria, Illinois as NRRL B-15868] was
linearized with BamHI;

2. The linearized plasmid was digested with
BAL31;

3. The resulting DNA was treated with Klenow
fragment to enhance blunt ends, and ligated to EcoRI
linkers;

4. The ligation products were transformed
into E. coli strain MM294;

5. Transformants were screened by the colony
hybridization technique using a synthetic oligonucleotide
having the following sequence:

5' TTATTCGAAACGGGAATTCC 3'.

This oligonucleotide contains the AOX1 promoter sequence
up to, but not including, the ATG initiation codon, fused
to the sequence of the EcoRI linker;

6. Positive clones were sequenced by the
Maxam-Gilbert technique. All three positives had the
following sequence:

5' ...TTATTCGAAACGAGGAATTCC... 3'.

They all retained the "A" of the ATG (underlined in the
above sequence). It was decided that this A would
probably not be detrimental; thus all subsequent clones
are derivatives of these positive clones. These clones
have been given the laboratory designation pAOP1, pAOP2
and pAOP3 respectively.
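
The difference between the screening oligonucleotide and the sequence found in the positive clones can be located with a small comparison. This is a sketch only: it assumes the screening oligonucleotide reads TTATTCGAAACGGGAATTCC (promoter sequence followed by the GGAATTCC EcoRI linker), and the helper name is hypothetical.

```python
PROBE = "TTATTCGAAACGGGAATTCC"    # assumed probe: promoter end + EcoRI linker
FOUND = "TTATTCGAAACGAGGAATTCC"   # sequence reported for the positive clones
ECORI_LINKER = "GGAATTCC"


def single_insertion(expected: str, observed: str):
    """Return (index, base) if `observed` is `expected` plus one extra base.

    Returns None when the two sequences do not differ by exactly one
    inserted base.
    """
    if len(observed) != len(expected) + 1:
        return None
    for i, base in enumerate(observed):
        if observed[:i] + observed[i + 1:] == expected:
            return i, base
    return None


promoter_part = PROBE[: -len(ECORI_LINKER)]
insertion = single_insertion(PROBE, FOUND)
print(promoter_part, insertion)
```

With these inputs the extra base is an A located directly after the twelve promoter bases and before the linker, which matches the statement that the clones retained the "A" of the ATG.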

c. Construction of the expression vector pAO817

The hEGF gene and the AMF pre-pro sequence in
the same translational direction were inserted into
M13mp19 [New England Biolabs] by the following procedure:

10 µg of M13mp19 were digested with SmaI and
EcoRI and the large, about 7240 bp plasmid fragment was
isolated on a 0.8% agarose gel. The plasmid fragment and
a 267 bp fragment containing the AMF pre-pro sequence
(including the proteolytic processing site: lys-arg) were
ligated together by T4 DNA ligase. The 267 bp fragment
containing the AMF pre-pro sequence was obtained by
digesting 15 µg of plasmid pAO208 with HindIII, filling in
with Klenow-fragment DNA polymerase, and digesting with
EcoRI. The digestion was run on a 1.7% agarose gel and
the 267 bp fragment isolated.

The M13mp19-AMF pre-pro sequence ligation
mixture was then transformed into JM103 cells and DNA
from the plaque was characterized. Plasmid DNA was
prepared from these cells and was digested with SalI,
filled in with Klenow-fragment DNA polymerase, and cut
with HindIII. The about 7400 bp plasmid fragment was
isolated and ligated to the 160 bp hEGF gene fragment.
The ligation mixture was transformed into JM101 cells and
plaques were selected. Cells having the M13mp19 plasmid
with both the EGF gene and the AMF pre-pro sequence in
the same translational direction were called pEGF19-3.

In vitro mutagenesis was performed on pEGF19-3
to remove the polylinker of M13mp19 and to place the DNA
sequence encoding the AMF processing site: lys-arg, directly
in front of the first codon of mature EGF. The
mutagenesis was accomplished using standard techniques
[Zoller and Smith, Meth. Enzymol., 100 (1983)]. The
mutagenizing oligonucleotide employed was of the
following sequence:

5' TTC TTT GGA TAA AAG AAA TTC CGA TAG CGA GT 3'.

The screening oligonucleotide had the sequence:
GATAAAAGAAATTCCGAT. The mutagenized plasmid was called
pEGF19m-2.
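
The junction encoded by the mutagenizing oligonucleotide can be illustrated by translating it with the standard genetic code. The sketch below assumes the coding frame begins at the second base of the oligonucleotide as printed; the specification does not state the frame, so this offset is an assumption, as are all names in the code.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

MUTAGENIC = "TTC TTT GGA TAA AAG AAA TTC CGA TAG CGA GT".replace(" ", "")
SCREENING = "GATAAAAGAAATTCCGAT"


def translate(seq: str, offset: int = 0) -> str:
    """Translate `seq` from `offset`, ignoring any incomplete trailing codon."""
    usable = seq[offset:]
    usable = usable[: len(usable) - len(usable) % 3]
    return "".join(CODON_TABLE[usable[i:i + 3]] for i in range(0, len(usable), 3))


junction = translate(MUTAGENIC, offset=1)   # assumed reading frame
print(junction)
print(SCREENING in MUTAGENIC)
```

Read in that frame, the oligonucleotide encodes Ser-Leu-Asp-Lys-Arg followed by Asn-Ser-Asp-Ser-Glu, i.e. the lys-arg processing site immediately ahead of the first residues of the hEGF sequence listed earlier in the specification. The screening oligonucleotide is a sub-sequence of the mutagenizing oligonucleotide spanning that junction.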

EcoRI linkers of sequence GGAATTCC were added
to the 3' end of the hEGF gene in the plasmid pEGF19m-2
by first digesting 20 ng of the plasmid with HindIII and
then filling in with Klenow-fragment DNA polymerase. 1
ng of linkers and 20 ng of treated plasmid were ligated
together and then digested with EcoRI to remove excess
linkers. The ≈435 bp EcoRI fragment was isolated on a
1.5% agarose gel. 15 µg of the plasmid pAO815 (the
construction of which is described below) were digested
with EcoRI and ligated to the 435 bp EcoRI fragment in a
standard ligation reaction. The reaction was used to
transform MC1061 cells and ampR cells were selected. To
determine which cells have a plasmid with the correct
orientation of the AMF pre-pro sequence - hEGF gene
insert, plasmid DNA was prepared from the ampR colonies
and was digested with PstI. A correct construct yielded
an about 1740 bp fragment. Colonies demonstrating the
correct restriction pattern were called pAO816.

The complete AOX1-promoter expression cassette
was removed from pAO816 by digesting 15 µg of pAO816 with
BglII and BamHI, and isolating the about 1670 bp fragment
on a gel. The gel-purified fragment was then ligated to
BamHI-cut pAO816. The ligation mix was used to transform
MC1061 cells and ampR colonies were selected. Colonies
having plasmids comprised of two head-to-tail expression
cassettes (referred to as pAO817) were identified by
digestion with PstI, which gave fragments of 1827, 1497
and 9547 bp. The restriction map of pAO817 is shown in
Figure 4.

d. Construction of plasmid pEGF819
Plasmid pEGF819 was constructed as follows:
Plasmid pAO817 was digested with BglII and
BamHI and the 3600 bp fragment containing two expression
cassettes was isolated on a 0.8% agarose gel. 250 ng of
fragment and 25 ng of BamHI-cut phosphatase-treated
pAO817 were ligated together. The ligation was used to
transform MC1061 cells and AmpR cells were selected. DNA
was prepared from the transformants and digested with
BglII and BamHI. The plasmid was characterized by
digesting with multiple restriction enzymes and comparing
the resultant banding pattern with the banding pattern of
other restriction enzyme digested vectors comprising
known numbers of hEGF-sized expression cassettes. Based
on the mobility of the hEGF-containing fragment of
pEGF819 relative to the mobility of expression vectors
comprising four and six expression cassettes, it was
concluded that pEGF819 has five expression cassettes in
tandem.

A restriction map of pEGF819 is shown in Figure
5.

e. Construction of plasmid pAO804

Plasmids pAO804 and pAO807 are used in the
construction of plasmid pAO815.

Plasmid pAO804 has been described in PCT
Application No. WO 89/04320. Construction of this
plasmid involved the following steps:

Plasmid pBR322 was modified as follows to
eliminate the EcoRI site and insert a BglII site into the
PvuII site:

pBR322 was digested with EcoRI, the protruding
ends were filled in with Klenow Fragment of E. coli DNA
polymerase I, and the resulting DNA was recircularized
using T4 ligase. The recircularized DNA was used to
transform E. coli MC1061 to ampicillin-resistance and
transformants were screened for having a plasmid of about
4.37 kbp in size without an EcoRI site. One such
transformant was selected and cultured to yield a
plasmid, designated pBR322ΔRI, which is pBR322 with the
EcoRI site replaced with the sequence:

5'-GAATTAATTC-3'
3'-CTTAATTAAG-5'.

pBR322ΔRI was digested with PvuII, and the
linker having the sequence:

5'-CAGATCTG-3'
3'-GTCTAGAC-5'

was ligated to the resulting blunt ends employing T4
ligase. The resulting DNAs were recircularized, also
with T4 ligase, and then digested with BglII and again
recircularized using T4 ligase to eliminate multiple
BglII sites due to ligation of more than one linker to
the PvuII-cleaved pBR322ΔRI. The DNAs, treated to
eliminate multiple BglII sites, were used to transform E.
coli MC1061 to ampicillin-resistance. Transformants were
screened for a plasmid of about 4.38 kbp with a BglII
site. One such transformant was selected and cultured to
yield a plasmid, designated pBR322ΔRIBGL, for further
work. Plasmid pBR322ΔRIBGL is the same as pBR322ΔRI
except that pBR322ΔRIBGL has the sequence

5'-CAGCAGATCTGCTG-3'
3'-GTCGTCTAGACGAC-5'

in place of the PvuII site in pBR322ΔRI.
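
The two site modifications described above (EcoRI fill-in and religation, then BglII linker insertion at the blunt PvuII cut) can be reproduced on the recognition sequences alone. The following is a sketch: the recognition sequences and cut positions for EcoRI (G^AATTC) and PvuII (CAG^CTG) are standard, while the function names are illustrative and not part of the specification.

```python
def fill_in_and_religate(site: str, cut_after: int) -> str:
    """Top-strand sequence after filling in a 5' overhang and blunt religation.

    `site` is a palindromic recognition sequence cut on the top strand after
    `cut_after` bases. Filling in both ends duplicates the overhang.
    """
    if cut_after * 2 >= len(site):
        raise ValueError("cut position does not leave a 5' overhang")
    left_end = site[: len(site) - cut_after]   # left fragment after fill-in
    right_end = site[cut_after:]               # right fragment after fill-in
    return left_end + right_end


def insert_linker_blunt(site: str, cut_after: int, linker: str) -> str:
    """Top-strand sequence after one linker is ligated into a blunt cut."""
    return site[:cut_after] + linker + site[cut_after:]


ecori_filled = fill_in_and_religate("GAATTC", cut_after=1)
pvuii_with_linker = insert_linker_blunt("CAGCTG", cut_after=3, linker="CAGATCTG")
print(ecori_filled)
print(pvuii_with_linker)
```

Both results agree with the sequences given in the text: GAATTAATTC no longer contains an EcoRI site (GAATTC), and CAGCAGATCTGCTG contains a single BglII site (AGATCT).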

pBR322ΔRIBGL was digested with SalI and BglII
and the large fragment (approximately 2.97 kbp) was
isolated. Plasmid pBSAGI5I, which is described in
European Patent Application Publication No. 0 226 752,
was digested completely with BglII and XhoI and an
approximately 850 bp fragment from a region of the P.
pastoris AOX1 locus downstream from the AOX1 gene
transcription terminator (relative to the direction of
transcription from the AOX1 promoter) was isolated. The
BglII-XhoI fragment from pBSAGI5I and the approximately
2.97 kbp, SalI-BglII fragment from pBR322ΔRIBGL were
combined and subjected to ligation with T4 ligase. The
ligation mixture was used to transform E. coli MC1061 to
ampicillin-resistance and transformants were screened for
a plasmid of the expected size (approximately 3.8 kbp)
with a BglII site. This plasmid was designated pAO801.
The overhanging end of the SalI site from the
pBR322ΔRIBGL fragment was ligated to the overhanging end
of the XhoI site on the 850 bp pBSAGI5I fragment and, in
the process, both the SalI site and the XhoI site in
pAO801 were eliminated.

pBSAGI5I was then digested with ClaI and the
approximately 2.0 kbp fragment was isolated. The 2.0 kbp
fragment has an approximately 1.0 kbp segment which
comprises the P. pastoris AOX1 promoter and transcription
initiation site, an approximately 700 bp segment encoding
the hepatitis B virus surface antigen ("HBsAg") and an
approximately 300 bp segment which comprises the P.
pastoris AOX1 gene polyadenylation signal and
site-encoding segments and transcription terminator. The
HBsAg coding segment of the 2.0 kbp fragment is
terminated, at the end adjacent the 1.0 kbp segment with
the AOX1 promoter, with an EcoRI site and, at the end
adjacent the 300 bp segment with the AOX1 transcription
terminator, with a StuI site, and has its subsegment
which codes for HBsAg oriented and positioned, with
respect to the 1.0 kbp promoter-containing and 300 bp
transcription terminator-containing segments, operatively
for expression of the HBsAg upon transcription from the
AOX1 promoter. The EcoRI site joining the promoter
segment to the HBsAg coding segment occurs just upstream
(with respect to the direction of transcription from the
AOX1 promoter) from the translation initiation
signal-encoding triplet of the AOX1 promoter.

For more details on the promoter and terminator
segments of the 2.0 kbp, ClaI-site-terminated fragment of
pBSAGI5I, see European Patent Application Publication No.
226,846 and Ellis et al., Mol. Cell. Biol. 5, 1111 (1985).

Plasmid pAO801 was cut with ClaI and combined
for ligation using T4 ligase with the approximately 2.0
kbp ClaI-site-terminated fragment from pBSAGI5I. The
ligation mixture was used to transform E. coli MC1061 to
ampicillin resistance, and transformants were screened
for a plasmid of the expected size (approximately 5.8
kbp) which, on digestion with ClaI and BglII, yielded
fragments of about 2.32 kbp (with the origin of
replication and ampicillin-resistance gene from pBR322)
and about 1.9 kbp, 1.48 kbp, and 100 bp. On digestion
with BglII and EcoRI, the plasmid yielded an
approximately 2.48 kbp fragment with the 300 bp
terminator segment from the AOX1 gene and the HBsAg
coding segment, a fragment of about 900 bp containing the
segment from upstream of the AOX1 protein encoding
segment of the AOX1 gene in the AOX1 locus, and a
fragment of about 2.42 kbp containing the origin of
replication and ampicillin resistance gene from pBR322
and an approximately 100 bp ClaI-BglII segment of the
AOX1 locus (further upstream from the AOX1-encoding
segment than the first mentioned 900 bp EcoRI-BglII
segment). Such a plasmid had the ClaI fragment from
pBSAGI5I in the desired orientation. In the opposite,
undesired orientation, there would be EcoRI-BglII
fragments of about 3.3 kbp, 2.38 kbp and 900 bp.

One of the transformants harboring the desired
plasmid, designated pAO802, was selected for further work
and was cultured to yield that plasmid. The desired
orientation of the ClaI fragment from pBSAGI5I in pAO802
had the AOX1 gene in the AOX1 locus oriented correctly to
lead to the correct integration into the P. pastoris
genome at the AOX1 locus of linearized plasmid made by
cutting at the BglII site at the terminus of the 800 bp
fragment from downstream of the AOX1 gene in the AOX1
locus.

pAO802 was then treated to remove the HBsAg
coding segment terminated with an EcoRI site and a StuI
site. The plasmid was digested with StuI and a linker of
sequence:

5'-GGAATTCC-3'
3'-CCTTAAGG-5'

was ligated to the blunt ends using T4 ligase. The
mixture was then treated with EcoRI and again subjected
to ligation using T4 ligase. The ligation mixture was
then used to transform E. coli MC1061 to ampicillin
resistance and transformants were screened for a plasmid
of the expected size (5.1 kbp) with EcoRI-BglII fragments
of about 1.78 kbp, 900 bp, and 2.42 kbp and BglII-ClaI
fragments of about 100 bp, 2.32 kbp, 1.48 kbp, and 1.2
kbp. This plasmid was designated pAO803. A transformant
with the desired plasmid was selected for further work
and was cultured to yield pAO803.
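
Restriction-digest screening of this kind relies on the fragment sizes of a complete digest adding up to the size of the circular plasmid. A minimal sketch of that bookkeeping follows, using the rounded sizes quoted above for the desired orientation of pAO802 and for pAO803; the 0.1 kbp tolerance and all identifiers are assumptions made for illustration.

```python
# (plasmid, digest) -> (expected plasmid size in kbp, fragment sizes in kbp)
DIGESTS = {
    ("pAO802", "ClaI + BglII"): (5.8, [2.32, 1.9, 1.48, 0.1]),
    ("pAO802", "BglII + EcoRI"): (5.8, [2.48, 0.9, 2.42]),
    ("pAO803", "EcoRI + BglII"): (5.1, [1.78, 0.9, 2.42]),
    ("pAO803", "BglII + ClaI"): (5.1, [0.1, 2.32, 1.48, 1.2]),
}


def digest_is_consistent(plasmid_kbp, fragments_kbp, tolerance_kbp=0.1):
    """True if the fragments of a complete digest sum to the plasmid size."""
    return abs(sum(fragments_kbp) - plasmid_kbp) <= tolerance_kbp


for (plasmid, enzymes), (size, fragments) in DIGESTS.items():
    status = "ok" if digest_is_consistent(size, fragments) else "mismatch"
    print(f"{plasmid:7s} {enzymes:14s} sum={sum(fragments):.2f} kbp {status}")
```

A check of this sort only confirms that a set of reported bands is internally consistent with the expected plasmid size; it does not by itself establish the orientation of an insert.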

Plasmid pAO804 was then made from pAO803 by
inserting, into the BamHI site from pBR322 in pAO803, an
approximately 2.75 kbp BglII fragment from the P.
pastoris HIS4 gene. See, e.g., Cregg et al., Mol. Cell.
Biol. 5, 3376 (1985) and European Patent Application
Publication Nos. 180,899 and 188,677. pAO803 was digested
with BamHI and combined with the HIS4 gene-containing
BglII site-terminated fragment and the mixture subjected
to ligation using T4 ligase. The ligation mixture was
used to transform E. coli MC1061 to ampicillin-resistance
and transformants were screened for a plasmid of the
expected size (7.85 kbp), which is cut by SalI. One such
transformant was selected for further work, and the
plasmid it harbors was designated pAO804.

pAO804 has one SalI-ClaI fragment of about 1.5
kbp and another of about 5.0 kbp and a ClaI-ClaI fragment
of 1.3 kbp; this indicates that the direction of
transcription of the HIS4 gene in the plasmid is the same
as the direction of transcription of the ampicillin
resistance gene and opposite the direction of
transcription from the AOX1 promoter.

The orientation of the HIS4 gene in pAO804 is
not critical to the function of the plasmid or of its
derivatives with cDNA coding segments inserted at the
EcoRI site between the AOX1 promoter and terminator
segments. Thus, a plasmid with the HIS4 gene in the
orientation opposite that of the HIS4 gene in pAO804
would also be effective for use in accordance with the
present invention.

f. Construction of plasmid pAO807

1. Preparation of f1-ori DNA
f1 bacteriophage DNA (50 µg) was digested
with 50 units of Rsa I and Dra I (according to
manufacturer's directions) to release the ≈458 bp DNA
fragment containing the f1 origin of replication (ori).
The digestion mixture was extracted with an equal volume
of phenol:chloroform (V/V) followed by extracting the
aqueous layer with an equal volume of chloroform and
finally the DNA in the aqueous phase was precipitated by
adjusting the NaCl concentration to 0.2 M and adding 2.5
volumes of absolute ethanol. The mixture was allowed to
stand on ice (4°C) for 10 minutes and the DNA precipitate
was collected by centrifugation for 30 minutes at 10,000
x g in a microfuge at 4°C.

The DNA pellet was washed 2 times with 70%
aqueous ethanol. The washed pellet was vacuum dried and
dissolved in 25 µl of TE buffer. This DNA was
electrophoresed on 1.5% agarose gel and the gel portion
containing the ≈458 bp f1-ori fragment was excised
and the DNA in the gel was electroeluted onto DE81
(Whatman) paper and eluted from the paper in 1 M NaCl. The
DNA solution was precipitated as detailed above and the
DNA precipitate was dissolved in 25 µl of TE buffer
(f1-ori fragment).

2. Cloning of f1-ori into Dra I sites of
pBR322

pBR322 (2 µg) was partially digested with
2 units Dra I (according to manufacturer's instructions).
The reaction was terminated by phenol:chloroform
extraction followed by precipitation of DNA as detailed
in step 1 above. The DNA pellet was dissolved in 20 µl
of TE buffer. About 100 ng of this DNA was ligated with
100 ng of f1-ori fragment (step 1) in 20 µl of ligation
buffer by incubating at 14°C overnight with 1 unit of
T4 DNA ligase. The ligation was terminated by heating to
70°C for 10 minutes and then used to transform E. coli
strain JM103. AmpR transformants were pooled and
superinfected with helper phage R408. Single stranded
phage were isolated from the media and used to reinfect
JM103. AmpR transformants contained pBRf1-ori which
contains f1-ori cloned into the Dra I sites (nucleotide
positions 3232 and 3251) of pBR322.

3. Construction of plasmid pAO807

pBRf1-ori (10 µg) was digested for 4 hours
at 37°C with 10 units each of Pst I and Nde I. The
digested DNA was phenol:chloroform extracted,
precipitated and dissolved in 25 µl of TE buffer as
detailed in step 1 above. This material was
electrophoresed on a 1.2% agarose gel and the Nde I - Pst
I fragment (approximately 0.8 kb) containing the f1-ori
was isolated and dissolved in 20 µl of TE buffer as
detailed in step 1 above. About 100 ng of this DNA was
mixed with 100 ng of pAO804 that had been digested with
Pst I and Nde I and phosphatase-treated. This mixture
was ligated in 20 µl of ligation buffer by incubating
overnight at 14°C with 1 unit of T4 DNA ligase. The
ligation reaction was terminated by heating at 70°C for
10 minutes. This DNA was used to transform E. coli
strain JM103 to obtain pAO807.

g. Construction of plasmid pAO815
Plasmid pAO815 was constructed by mutagenizing
plasmid pAO807 to change the ClaI site downstream of the
AOX1 transcription terminator in pAO807 to a BamHI site.
The oligonucleotide used for mutagenizing pAO807 had the
following sequence:

5' GAC GTT CGT TTG TGC GGA TCC AAT GCG GTA GTT TAT 3'.

The mutagenized plasmid was called pAO807-Bam. Plasmid
pAO804 was digested with BglII and 25 ng of the 2400 bp
fragment were ligated to 250 ng of the 5400 bp BglII
fragment from BglII-digested pAO807-Bam. The ligation
mix was transformed into MC1061 cells and the correct
construct was verified by digestion with PstI/BamHI to
identify 6100 and 2100 bp sized bands. The correct
construct was called pAO815.



Example 2

Development of hEGF-secreting strains
1. Mut- strains

20 µg of the expression vector pAO817 were
digested with BglII, which releases the AOX1-ended tandem
expression cassette. The linear DNA fragment obtained by
digestion (5 µg) was transformed into the P. pastoris
strain GS115 (ATCC 20864) by the spheroplast method
[Cregg et al., Mol. Cell. Biol., 5, 3376 (1985)]. His+
cells were selected and the methanol utilization
phenotype (Mut) of the cells was determined as follows:

His+ transformants were plated on minimal
glucose (2%) master plates to obtain colonies originating
from single cells. After overnight incubation at 30°C,
the masters were replica-plated to minimal glucose plates
and plates containing no carbon source to which methanol
was added in vapor phase. This is accomplished by adding
an aliquot, approximately 200 µl, of methanol to the
underside of the top of a covered petri dish. The plates
were incubated at 30°C for 4-6 days with additional MeOH
added in the vapor phase every two days. Colonies
showing visible growth were scored as Mut+ and those with
no visible growth were scored as Mut-.
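
The replica-plate scoring rule described above can be written down as a small classification function. This is only a sketch of the rule as stated; the colony names, the data structure and the "not scored" category for colonies that fail to grow on the glucose control plate are hypothetical additions.

```python
from dataclasses import dataclass


@dataclass
class Colony:
    name: str                  # hypothetical colony label
    grows_on_glucose: bool     # growth on the minimal glucose replica plate
    grows_on_methanol: bool    # visible growth after 4-6 days of methanol vapor


def score_mut(colony: Colony) -> str:
    """Score the methanol-utilization phenotype from replica-plate growth."""
    if not colony.grows_on_glucose:
        # Assumption: without growth on the control plate no call is made.
        return "not scored"
    return "Mut+" if colony.grows_on_methanol else "Mut-"


colonies = [
    Colony("colony-1", grows_on_glucose=True, grows_on_methanol=True),
    Colony("colony-2", grows_on_glucose=True, grows_on_methanol=False),
    Colony("colony-3", grows_on_glucose=False, grows_on_methanol=False),
]
scores = {c.name: score_mut(c) for c in colonies}
print(scores)
```

In practice the call depends on what counts as "visible growth" on the methanol plate, which remains a judgment made at the bench rather than something this sketch captures.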

Approximately 15% of the cells were His+Mut-,
indicating that the expression vector integrated
correctly at the AOX1 locus and disrupted the AOX1 gene.
Southern analysis of an EcoRI digest of the
transformants, using the plasmid pAO803 as probe,
confirmed the disruption of the AOX1 gene and showed the
number of expression units integrated. The strains were
named as follows:

                           Site of
Name          Phenotype    Integration   Copy Number

G-EGF817S10   Mut-His+     AOX1          One

G-EGF817S7    Mut-His+     AOX1          One

G-EGF817S9    Mut-His+     AOX1          Multiple

In the above table copy number refers to the
number of BglII fragments integrated. Each BglII
fragment is comprised of two EGF expression cassettes.

2. Mut+ strains

P. pastoris strain GS115 (ATCC 20864) was
transformed with 5 µg of uncut vector pAO817 using the
spheroplast method of transformation. In this type of
transformation the plasmid will integrate by addition
into the P. pastoris genome at a site of homology between
the plasmid and the host strain. The transformants were
screened for His+Mut+ phenotype, and several were picked
for Southern analysis. An EcoRI digest was probed with
plasmid pYM4 [pYM4 was obtained by digesting pYM30 (NRRL
B-15890) with ClaI and religating the ends] and the
hybridization pattern revealed two of the six had
appropriate integrations:

                           Site of
Name          Phenotype    Integration   Copy Number

G+EGF817S1    Mut+His+     HIS4          One

G+EGF817S6    Mut+His+     HIS4          One



P. pastoris strain GS115 was transformed with 1
µg of uncut plasmid pEGF819 using the spheroplast method
of transformation. The transformants were screened for
His+Mut+ phenotype and several were picked for Southern
analysis as described for the Mut+ pAO817 transformants.
Isolate G+EGF819S4 was further characterized by isolating
a major portion of plasmid pEGF819 from the genome of
G+EGF819S4, then digesting with multiple restriction
enzymes, as follows:

Genomic DNA was isolated from a culture of
G+EGF819S4 grown from a single colony. The DNA was
digested with the restriction enzyme BglII and separated
by agarose gel electrophoresis. As a size marker,
plasmid pEGF819 was also digested with BglII and
electrophoresed on the same gel. Previous Southern
analysis of BglII-digested G+EGF819S4 DNA probed with
pBR322 containing AOX1 5' and 3' regions, or HIS4, or an
EGF-specific oligonucleotide, indicated that the BglII
genomic fragment of G+EGF819S4 containing multiple
expression cassettes is approximately the same size as
the BglII fragment from the plasmid. Therefore, the area
of the genomic digest corresponding in size to the BglII
fragment from the plasmid was eluted from the gel and
cloned into pUC19 to create plasmid pEGF772-3.

Plasmids pEGF772-3 and pEGF819 were analyzed by
digestion with several restriction enzymes including
EcoRI, HindIII, BamHI, EcoRV, SalI, SacI, PstI, XbaI, and
double digests with EcoRI/BamHI, HindIII/BamHI. These
digests indicated that although the genomic BglII
fragment of G+EGF819S4 and the BglII fragment from
pEGF819 are the same size, the 5' ends of the fragments
are different. The 5' end of the pEGF772-3 insert was
sequenced to precisely identify how the two fragments
differed.

Partial sequence analysis of pEGF772-3 revealed
that the 5' end of the genomic BglII fragment contains
the 3' portion of the AOX1 gene instead of the expected
AOX1 promoter region from the first cassette. A subclone
of the genomic fragment was generated in order to
sequence the 5' end of the fragment completely and to
more exactly determine the site of integration of the
expression plasmid. Thus, pEGF772-3 was digested with
EcoRI and reclosed with T4 DNA ligase. This procedure
eliminated the majority of the genomic fragment leaving
approximately 2500 bp of the 5' end. The new plasmid was
called pEGF772-8.

The sequence of this 5' region (approximately
2200 bp sequenced) of the genomic BglII fragment
indicated that the 3' end of the AOX1 gene, including its
transcription termination region, is intact. It appears
that expression plasmid pEGF819 integrated into the AOX1
locus by recombination of the transcription termination
region of the first expression cassette with the
homologous region in the AOX1 gene. This recombination
resulted in the loss of the first cassette as well as the
2400 bp of pBR322 found upstream of the first expression
cassette in the plasmid.

The 3' end of the AOX1 gene, including its
transcription termination segment, contains approximately
1500 bp of DNA, which is approximately the same size as
one of the EGF expression cassettes. Thus, although the
genomic BglII fragment of G+EGF819S4 is approximately the
same size as BglII-linearized pEGF819, which contains
five copies of the EGF cassette, the expression strain
contains only four copies of the expression cassette, as
well as the 3' end of AOX1.

To further characterize the expression
cassettes integrated in G+EGF819S4, plasmid pEGF772-3 was
digested with EcoRI, which liberates four individual
AMF-EGF fusion coding regions. The band corresponding to the
cassettes was isolated from vector bands from an agarose
gel and cloned into M13mp18. Twelve of these M13 clones
were sequenced by the Sanger dideoxy method, and all were
found to have the expected nucleic acid sequence.



Example 3 

Fermentation of EGF strains

a. Fermentor start-up and general operation

The 2-liter fermentors (L.H. Fermentation,
Hayward, CA; Biolafitte, LSL Biolafitte, Princeton, NJ)
were autoclaved at a 700 ml volume containing 225 ml of
10X basal salts (52 ml/l 85% phosphoric acid, 1.8 g/l
Calcium Sulfate-2H2O, 28.6 g/l Potassium Sulfate, 23.4
g/l Magnesium Sulfate-7H2O, 6.5 g/l Potassium Hydroxide)
and 30 g glycerol. After sterilization, 3 ml of a YTM4
trace salts solution (5.0 ml/l Sulfuric Acid, 65.0 g/l
Ferrous Sulfate-7H2O, 6.0 g/l Copper Sulfate-5H2O, 20.0
g/l Zinc Sulfate-7H2O, 3.0 g/l Manganese Sulfate-H2O, 0.1
g/l Biotin) was added and the pH adjusted to 5.0 with the
addition of concentrated Ammonium Hydroxide; the pH was
then controlled at 5.0 with the addition of a 20%
Ammonium Hydroxide solution containing 0.1% Struktol J673
antifoam (Struktol Co., Stow, OH) throughout the
fermentation. Excessive foaming was controlled
throughout the fermentation by addition of Struktol J693
antifoam when foam contacted a foam sensor in the
fermentor. The fermentors were then inoculated with a
10-50 ml volume of inoculum (overnight shake flask
culture in phosphate-buffered 0.65% Yeast Nitrogen Base,
pH 6, containing 2% glycerol). Upon exhaustion of the
initial glycerol charge, a glycerol feed was started as
described below. The dissolved oxygen of the
fermentation was maintained above 20% of air saturation
by increasing the air flow rate up to 3 liter/minute and
agitation speed up to 1500 rpm during the fermentation.

Ten-liter fermentations (in a 15-liter
Biolafitte fermentor) were started in a 7.0 liter volume
containing 4 liters of 10X basal salts and 520 g of
glycerol for the Mut⁺ methanol fed-batch protocol. After
sterilization, 30 ml each of YTM4 and IM^ trace salts
solutions were added and the pH was adjusted and
subsequently controlled at 5.0 with the addition of
ammonia gas throughout the fermentation. Excessive
foaming was controlled with the addition of 5% Struktol
J673 antifoam. The fermentor was inoculated with a
volume of 200-500 ml. Upon exhaustion of the initial
glycerol charge, a feed was started as outlined below.
The dissolved oxygen was maintained above 20% by
increasing the air flow rate up to 40 liter/minute, the
agitation up to 1000 rpm and/or the pressure of the
fermentor up to 1.5 bar during the fermentation.

b. Growth of Mut⁻ strains in one-liter
fermentors

(1) Mut⁻ mixed-feed fed-batch
fermentation

Run 413: G-EGF817S10
Run 419: G-EGF817S9
Run 422: G-EGF817S9
Run 423: G-EGF817S10
Run 434: G-EGF817S9

After the glycerol batch phase was
completed, a 50% (by weight) glycerol feed, containing 12
ml/l YTM4 trace salts, was started at 5.4 ml/h for the
2-liter fermentor. After 6 hours of glycerol feeding, the
glycerol feed was decreased to 3.6 ml/h (36 ml/h at
10 liters) and a methanol feed containing 12 ml/l YTM4 trace
salts was initiated at 1.1 ml/h for the 2-liter
fermentor. After 5 hours, the methanol feed was adjusted
to give a residual methanol concentration of up to about
1%, preferably between 0.2 and 0.8%. The fermentation
was sampled periodically and harvested 36-50 hr after the
methanol feed was initiated.

(2) Mut⁻ methanol fed-batch

Run 425: G-EGF817S9
Run 426: G-EGF817S10

After the glycerol batch phase was
completed, an induced fed-batch phase was initiated by
adding methanol to the fermentor to maintain a residual
methanol concentration between 0.2 and 0.8%. The
fermentor was sampled periodically and harvested after
167 hr growth on methanol.

(3) Alternative procedure for production
of 1-52 hEGF

Run 470: G-EGF817S9

A two-liter LH fermentor containing
400 ml 10X basal salts, 80 g glycerol, and deionized
water (to 1 liter) was sterilized. After sterilization
and cooling, 3 ml YTM4 + biotin solution was added and 20%
NH4OH used to bring pH to 3.6. The fermentor was
inoculated with 60 ml of inoculum of Mut⁻ cells and the pH
controller set at 5.0. During batch growth, the
agitation speed was adjusted upward periodically to
maintain a dissolved oxygen tension above 20% air
saturation. After exhaustion of the initial glycerol
charge, a 50% solution of glycerol containing 12 ml/l YTM4
+ biotin was pumped into the fermentor at the rate of 20
ml/h. Four and one-half hours later, the glycerol feed
rate was decreased to 10 ml/h and a feed of methanol
containing 12 ml/l YTM4 + biotin was started at 1.0 ml/h.
Three hours later the methanol feed rate was doubled.
After ninety minutes at 2 ml/h, the methanol feed rate
was adjusted to 3.8 ml/h and maintained constant until
harvest at 13.5 hours after the methanol feed was first
initiated.
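
The methanol feed profile of Run 470 can be totalled with a short calculation. The following is a minimal sketch only, assuming that each rate change takes effect immediately at the stated time and holds until the next change; the function and variable names are illustrative and are not part of the original procedure.

```python
def total_volume(schedule, end_h):
    """Integrate a stepwise feed schedule.

    schedule: list of (start_hour, rate_ml_per_h), sorted by start_hour,
              with hours counted from the start of the methanol feed.
    end_h:    hour at which the feed stops (harvest).
    """
    total = 0.0
    for i, (start, rate) in enumerate(schedule):
        # Each rate holds until the next change, or until harvest.
        stop = schedule[i + 1][0] if i + 1 < len(schedule) else end_h
        total += rate * (stop - start)
    return total


# Run 470: 1.0 ml/h for 3 h, then 2 ml/h for 1.5 h, then 3.8 ml/h to harvest
methanol_schedule = [(0.0, 1.0), (3.0, 2.0), (4.5, 3.8)]
methanol_ml = total_volume(methanol_schedule, end_h=13.5)
print(f"methanol feed delivered: {methanol_ml:.1f} ml")
```

Under these assumptions, roughly 40 ml of methanol feed is delivered to the one-liter culture before harvest.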
c. Growth of Mut⁺ strains in two-liter and 10-liter
fermentors

(1) Mut⁺ methanol fed-batch

Run 483: G+EGF819S4 (2L)
Run 464: G+EGF817S1 (2L)
Run 490: G+EGF819S4 (14L)
After glycerol exhaustion, a 50% glycerol
feed, containing 12 ml/l YTM4 trace salts, was started at
12 ml/h for the 2-liter or 200 ml/h for the 10-liter
fermentor and run for a total of 7 hours. After 6 hours
on the glycerol feed, the methanol feed, containing 12
ml/l YTM4 trace salts, was started at 1.1 ml/h for the
2-liter and 11 ml/h for the 10-liter fermentor for 5
minutes. When a rise in dissolved oxygen was seen after
the methanol feed was shut off, the methanol feed was
turned back on for another 5-minute interval. The latter
process was repeated several times until an immediate
response in the dissolved oxygen was observed upon
cessation of the methanol feed; once this occurred, the
methanol feed was increased by 20% per hour at 30-minute
intervals. The methanol feed was increased until a feed
rate of 7.6 ml/h for the 2-liter or 90 ml/h for the
10-liter fermentor was reached. The fermentation was then
carried out for 40-60 hours for the 2-liter or 25-35
hours for the 10-liter fermentor.
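
The methanol ramp described above can be tabulated numerically. The following is a minimal sketch, assuming that "20% per hour at 30 minute intervals" means a compounding 10% increase every half hour; this is only one reading of the text, and a non-compounded step would give a slightly different schedule. The function name and its parameters are hypothetical.

```python
def ramp_schedule(start_ml_h, final_ml_h, step_factor=1.10, step_h=0.5):
    """Return [(hours since ramp start, feed rate in ml/h), ...].

    Assumption (not stated in the text): the rate is multiplied by
    step_factor every step_h hours and is capped at final_ml_h.
    """
    schedule = [(0.0, start_ml_h)]
    rate = start_ml_h
    step = 0
    while rate < final_ml_h:
        step += 1
        rate = min(rate * step_factor, final_ml_h)
        schedule.append((step * step_h, rate))
    return schedule


# 2-liter fermentor: methanol feed ramped from 1.1 ml/h to 7.6 ml/h
two_liter = ramp_schedule(1.1, 7.6)
for hours, rate in two_liter:
    print(f"{hours:4.1f} h  {rate:5.2f} ml/h")
```

Under this assumption the 2-liter ramp needs 21 half-hour steps (10.5 hours) to reach 7.6 ml/h. Calling `ramp_schedule(11, 90)` gives the corresponding schedule for the 10-liter fermentor.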

(2) Alternative procedure for growth of
Mut⁺ strains

Fifteen-liter fermentations employing
strain G+EGF819S4 (in a 15-liter Biolafitte fermentor)
were started in a six-liter volume containing four liters
of 10X basal salts and 400 g of glycerol. After
sterilization, 25 ml of PTM1 trace salts solution [6.0 g/L
cupric sulfate·5H2O, 0.08 g/L sodium iodide, 3.0 g/L
manganese sulfate·H2O, 0.2 g/L sodium molybdate·2H2O, 0.02
g/L boric acid, 0.05 g/L cobalt chloride, 56.0 g/L
ferrous sulfate·7H2O, 0.2 g/L biotin and 5.0 ml/L sulfuric
acid (conc)] was added and the pH was adjusted and
subsequently controlled at 5.0 with the addition of
ammonia gas throughout the fermentation. Excessive
foaming was controlled with the addition of 5% Struktol
J673 antifoam. The fermentor was inoculated with a
volume of 500 ml of an overnight culture (OD600 = 1 to 4)
in YNB, 2% glycerol, 0.1 M phosphate, pH 6. The
dissolved oxygen was maintained above 20% by increasing
the air flow rate up to 20 liter/minute, the agitation up
to 1000 rpm and/or the pressure of the fermentor up to
1.5 bar during the fermentation.

After exhaustion of the initial glycerol
charge, a 50% glycerol feed, containing 12 ml/L PTM1 trace
salts, was initiated at a rate of 120 ml/h; the glycerol
feed continued for 6 hours, at which time the methanol
feed, containing 12 ml/L PTM1 trace salts, was started at
a rate of 20 ml/h. The methanol feed was increased by
20% each hour at half-hour intervals until a methanol
feed rate of 100 ml/h was reached. The fermentation was
then continued for 25-35 hours.

The conditions for 2- and 250-liter
fermentations were scaled proportionately from the
15-liter fermentation, except that the final methanol feed
rate was limited to the highest rate at which the
dissolved oxygen concentration could be maintained above
20% air saturation. In 2- and 250-liter fermentations,
the pH was controlled with NH4OH rather than NH3, and in
the 250-liter fermentation, the air sparge was
supplemented with O2 in some runs.

Example 4

Results of fermentations

a. Mut⁻ strains

The time course of cell growth and EGF
expression in four fermentor runs of two Mut⁻ strains was
investigated. Cell growth for strains G-EGF817S9 and
G-EGF817S10 under a methanol fed-batch protocol was
similar, yielding about 300 g/l wet cells after 167 h.
However, the multicopy strain G-EGF817S9 produced 400
mg/l of EGF, twice as much as the strain with only two
copies of the EGF expression cassette. The maximum
concentration of EGF was reached after about 120 hours
growth on methanol.

A similar pattern is observed for the two
strains growing more rapidly under the mixed-feed
protocol. In this protocol, both strains again grew up
to more than 300 g/l, and the 400 mg/l of EGF produced by
the multicopy strain is again higher than that produced
by the double copy strain.

The two fermentation protocols, i.e., the
methanol fed-batch and mixed-feed protocols, were then
carried out with the multicopy strain G-EGF817S9. The
results demonstrate that a dramatically reduced time on
methanol is required to produce EGF using mixed feed
compared to using methanol alone, 35 hr vs. 120 hr,
respectively. The initial batch growth on glycerol to
build up cell mass adds another 24 h to the overall
process time. The EGF productivities for the methanol
and mixed-feed modes are 3 mg l⁻¹ h⁻¹ and 7 mg l⁻¹ h⁻¹,
respectively.
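
The two productivity figures follow from the titer and the overall process time. The sketch below reproduces the arithmetic, assuming that both feeding modes reach about 400 mg/l with the multicopy strain and that the overall process time is the time on methanol plus the 24 h glycerol batch phase; the function name is illustrative only.

```python
def volumetric_productivity(titer_mg_l, hours_on_methanol, batch_h=24.0):
    """Average EGF productivity (mg per liter per hour) over the whole run."""
    return titer_mg_l / (hours_on_methanol + batch_h)


methanol_only = volumetric_productivity(400.0, 120.0)  # methanol fed-batch
mixed_feed = volumetric_productivity(400.0, 35.0)      # mixed feed

print(f"methanol fed-batch: {methanol_only:.1f} mg/l/h")
print(f"mixed feed:         {mixed_feed:.1f} mg/l/h")
```

Rounded to whole numbers, these values agree with the 3 and 7 mg per liter per hour quoted above.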

b. Mut⁺ strains

The time course of hEGF production at both 1 L
(Run 483) and 10 L (Run 490) volumes in fermentations
employing the Mut⁺ strain G+EGF819S4 was investigated.
Results are summarized in Table I.



Table I

Time Course of hEGF Production from
Strain G+EGF819S4

Reaction      Time on methanol    Approximate EGF
scale, l      feed, hr            concn in broth, mg/l

 1             4                  [illegible]
 1             7                  [illegible]
 1            12                  [illegible]
 1            20                  305
 1            44                  465

10             0                   25
10             3                   60
10             5                  130
10             7                  160
10            11                  260
10            13                  340
10            15                  330
10            18                  345
10            20                  290
10            26                  325
10            33                  400
10            40                  610



The higher hEGF production seen in these
fermentations, 500-600 mg/L, as compared to the Mut⁻
fermentations is due to the higher copy number of
G+EGF819S4 (4) rather than the Mut⁺ phenotype. A Mut⁺
strain carrying two copies of the EGF gene, G+EGF817S1,
produced hEGF at concentrations similar to those seen in
a Mut⁻ strain carrying two copies of the hEGF gene.
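
As an illustration of how the Table I time course can be summarized, the sketch below takes the twelve-point series running from 0 to 40 hr on methanol and reports the highest titer and the average accumulation rate. It assumes that consecutive table entries pair up as (time, concentration); the variable names are illustrative only.

```python
# (hours on methanol feed, approximate EGF in broth in mg/l), from Table I
series = [
    (0, 25), (3, 60), (5, 130), (7, 160), (11, 260), (13, 340),
    (15, 330), (18, 345), (20, 290), (26, 325), (33, 400), (40, 610),
]

# Sample with the highest measured titer
peak_time, peak_titer = max(series, key=lambda point: point[1])

# Average accumulation rate between the first and last samples
(t0, c0), (t1, c1) = series[0], series[-1]
average_rate = (c1 - c0) / (t1 - t0)

print(f"highest titer: {peak_titer} mg/l at {peak_time} hr")
print(f"average accumulation: {average_rate:.1f} mg/l per hr")
```

The values are approximate assay readings, so the dips at 15 and 20 hr are probably measurement scatter rather than a real loss of product.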

Example 5 

Analysis of secreted EGF 

1. Western analysis 

The first mode of analysis for evaluation
of Pichia-produced EGF was the Western blot. Because
antisera against hEGF can have low cross-reactivity to
mEGF, it was necessary to obtain human EGF standard and
antisera, instead of mEGF and anti-mEGF, respectively,
for our analyses. The human EGF reagents were acquired
commercially from Amgen (standard) and Biomedical
Technologies Inc., Stoughton, MA (antisera).
Electrophoresis of broth samples was conducted on a 15%
polyacrylamide gel. Western blot analysis revealed that
almost all the immunoreactive material was found in a
single band which was of the same size as the human EGF
standard. In several of the samples (Runs 422, 423,
425), a larger molecular weight species, approximately 33
kD in size, was also seen to react with the antisera. By
this analysis, the amount of EGF produced in runs with
the multicopy strain appeared to be about twice as much
as that produced in runs with the two-copy strain.

2. Stained gels

Protein bands on the acrylamide gels were
also visualized by staining with Coomassie blue. The
primary protein species in the gel typically has an
electrophoretic mobility similar to that of standard EGF.
A further confirmation of the relative abundance of EGF
protein in the broth was given by total protein assay of
the broth. In the sample from Run 423, total TCA
precipitable protein determined by the Lowry assay (100
mg/l ± 10 mg/l) was on the low end of the EGF
concentration range estimated by Western blot with
f-met-EGF standard (100-180 mg/l).

3. Separation of EGF peptides on HPLC 

Three peptides that eluted separately on
reverse phase HPLC were purified to homogeneity by
analytical HPLC. These peptides are designated with the
numbers 1, 2, and 4 in the order of decreasing elution
time. Approximately 50 µg of each peptide was obtained
in a volatile buffer. Peaks 1 and 2 were purified from
Run 470; Peak 4 was purified from Run 425. Peak 3 was
not purified due to its relatively low concentration.

All three peaks were submitted to quantitative
amino acid analysis after hydrolysis in 6N HCl containing
0.1% phenol. The compositions of hEGF Peaks 1 and 2 are
consistent with an hEGF peptide that lacks a single
arginine at the C-terminus. The composition of Peak 4,
on the other hand, shows lower amounts of leucine and
glutamic acid, and suggests decreased yields of lysine
and aspartic acid. All three peptides, however, possess
an authentic amino-terminus as determined by automated
Edman degradation. This suggests that the difference in
the composition of Peptide 4 results from alteration at
its carboxy-terminus.

In an effort to determine the carboxy-terminal
sequence of the peptides, they were each digested with
carboxypeptidase Y (CPY) and the amino acids released
over time were measured on an amino acid analyzer. The
most rapidly released amino acids were leucine followed
by glutamic acid. Thus, it was concluded that Peak 1 is
a 1-52 product of the originally translated peptide; the
carboxy-terminal arginine was probably removed by
proteolysis during fermentation. Peak 2 gave similar
results as Peak 1, but Peak 4 did not yield any amino
acids. This negative result was difficult to interpret,
but could have been the result of a carboxy-terminal
residue that is difficult for CPY to release, such as a
lysine, which occurs at position 48 of hEGF.

To determine if the tryptophan residues at
positions 49 and 50 were absent in Peak 4, one microgram
of each peptide (1, 2, 4) was submitted to reverse phase
HPLC on a chromatography system equipped with a diode
array detector (Hewlett Packard 1090). Absorbance at 280
nm and 210 nm was collected simultaneously for each
peptide and the ratio 210 nm/280 nm was calculated both
on the basis of peak height and integrated area. This
ratio should be indicative of the tryptophan and tyrosine
content of a peptide. More specifically, the ratio
reflects the relative number of peptide bonds
(contributors to 210 nm absorbance) to the number of
tryptophan residues (contributors to 280 nm absorbance).
Tryptophan residues, when present in a sequence, tend to
mask the smaller contribution of tyrosine to the
absorbance at 280 nm.

The 210 nm/280 nm absorbance ratios of hEGF
Peaks 1 and 2 were in the same range as those of other
proteins that have a similar content of tryptophan. Peak
4, however, had a larger absorbance ratio, which indicated
the absence of any tryptophan residues. The unusually
high number of tyrosine residues that remain in Peptide 4
depresses the value of the absorbance ratio slightly.
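
The reasoning behind the absorbance ratio can be illustrated with a rough estimate. The sketch below is not the analysis performed here; it assumes the literature sequence of 53-residue hEGF, commonly used approximate extinction coefficients at 280 nm for tryptophan, tyrosine and cystine, and it treats 210 nm absorbance as simply proportional to the number of peptide bonds. All names are illustrative.

```python
# Literature sequence of 53-residue hEGF, one-letter code
HEGF = "NSDSECPLSHDGYCLHDGVCMYIEALDKYACNCVVGYIGERCQYRDLKWWELR"

# Approximate molar extinction coefficients at 280 nm (per M per cm)
EPS_TRP, EPS_TYR, EPS_CYSTINE = 5500, 1490, 125


def summarize(peptide, disulfides=3):
    """Estimate 280 nm absorbance and a crude 210/280 proxy for a peptide."""
    n_trp = peptide.count("W")
    n_tyr = peptide.count("Y")
    eps280 = n_trp * EPS_TRP + n_tyr * EPS_TYR + disulfides * EPS_CYSTINE
    peptide_bonds = len(peptide) - 1
    return {
        "Trp": n_trp,
        "Tyr": n_tyr,
        "eps280": eps280,
        # Peptide bonds per unit of 280 nm absorbance (higher means less Trp)
        "bonds_per_eps280": peptide_bonds / eps280,
    }


form_1_52 = summarize(HEGF[:52])
form_1_48 = summarize(HEGF[:48])
print("1-52:", form_1_52)
print("1-48:", form_1_48)
```

In this estimate the 1-48 form has no tryptophan and five tyrosines, so its predicted 280 nm absorbance falls to well under half that of the 1-52 form while the number of peptide bonds barely changes. That matches the larger 210 nm/280 nm ratio observed for Peak 4.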

The data from total amino acid analysis,
N-terminal sequence, carboxypeptidase Y digestion, and UV
absorbance ratios indicated that both Peaks 1 and 2 are
1-52 forms of hEGF while Peak 4 was a considerably
shorter form.

The molecular weight of peptide 4 was
subsequently determined by mass spectrometry and was
consistent with an hEGF peptide comprised of residues
1-48. Carboxypeptidase digests of the peptide confirmed
that the C-terminal residue is the 48th residue, lysine.
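
For orientation, the theoretical masses of the different hEGF forms can be estimated from the sequence. The measured mass is not reported in the text, so the sketch below is only a plausibility check. It assumes the literature sequence of hEGF, standard average residue masses, and three intact disulfide bonds; the names are illustrative.

```python
# Literature sequence of 53-residue hEGF, one-letter code
HEGF = "NSDSECPLSHDGYCLHDGVCMYIEALDKYACNCVVGYIGERCQYRDLKWWELR"

# Standard average residue masses in daltons
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594,
    "N": 114.1038, "D": 115.0886, "Q": 128.1307, "K": 128.1741,
    "E": 129.1155, "M": 131.1926, "H": 137.1411, "F": 147.1766,
    "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153
HYDROGEN = 1.00794


def average_mass(peptide, disulfides=3):
    """Average mass of a peptide; each disulfide bond removes two hydrogens."""
    residues = sum(RESIDUE_MASS[aa] for aa in peptide)
    return residues + WATER - 2 * disulfides * HYDROGEN


for label, end in (("1-53", 53), ("1-52", 52), ("1-48", 48)):
    print(f"hEGF {label}: {average_mass(HEGF[:end]):.1f} Da")
```

The 1-48 form is lighter than the 1-52 form by the combined residue mass of Trp-Trp-Glu-Leu, a difference of roughly 615 Da that mass spectrometry resolves easily.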

4. Amino acid sequencing

Fractions containing the HPLC peaks at
22.47 min, 28.74 min, and 31.44 min were collected, and
eight residues were sequenced on an automated gas phase
protein microsequencer. Both the 22.47 min and 31.44 min
peaks yielded the correct N-terminal sequence for EGF for
the first eight residues. The peak at 28.74 min was not
related to EGF.

5. Stability of secreted EGF
in fermentation broth of
Pichia pastoris

HPLC analysis of hEGF in the broth during the
time course of the fermentation runs revealed that the
1-48 peptide was much more stable than the longer forms.
The longer forms could be seen early after induction
during the run. After 24 h growth on methanol, peptide 4
would accumulate, apparently as a degradation product of
the other forms. Peptide 4 was very stable under
fermentation conditions, persisting and accumulating for
up to six days in the longer fermentation protocols.
This unexpectedly high stability makes production and
purification of this form of hEGF much simpler than that
of the longer forms.

6. Biological activity 

The 1-48 hEGF peptide was tested for biological 
activity both in in vitro cell mitogenic assays and in 
vivo in stimulation of gastric ulcer healing. The 
peptide was observed to have high biological activity in 
both types of tests. 

The invention has been described in detail with 
reference to particular embodiments thereof. It will be 
understood, however, that variations and modifications 
can be effected within the spirit and scope of the 
invention.

CLAIMS:

1. A DNA fragment comprising an expression
cassette, wherein said expression cassette comprises, in
the direction of transcription, the following DNA
sequences:

(i) a promoter region of a methanol
responsive gene of a methylotrophic yeast,
(ii) a DNA sequence encoding a polypeptide
consisting of:

(a) the S. cerevisiae AMF pre-pro
sequence, including the processing
site: lys-arg, and

(b) a DNA sequence encoding an EGF
peptide; and

(iii) a transcription terminator functional
in a methylotrophic yeast,
wherein said DNA sequences are operationally associated
with one another for transcription of the sequences
encoding said polypeptide.

2. A DNA fragment according to Claim 1
further comprising at least one selectable marker gene
and a bacterial origin of replication.

3. A DNA fragment according to Claim 2
wherein said fragment is contained within a circular
plasmid.

4. A DNA fragment according to Claim 1
wherein said sequence encoding an EGF peptide encodes the
1-53 or 1-48 form of EGF.

5. A DNA fragment according to Claim 1
wherein said methylotrophic yeast is a strain of Pichia
pastoris.

6. A DNA fragment according to Claim 5
wherein said methanol responsive gene of a methylotrophic
yeast and the transcription terminator are both derived
from the P. pastoris AOX1 gene.

7. A DNA fragment according to Claim 6
further comprising 3'- and 5'-ends having sufficient
homology with a target gene of a yeast host for said DNA
fragment to effect site directed integration of said
fragment into said target gene.

8. A DNA fragment according to Claim 1
further comprising 3'- and 5'-ends having sufficient
homology with a target gene of a yeast host for said DNA
fragment to effect site directed integration of said
fragment into said target gene.

9. A DNA fragment according to Claim 1
containing multiple copies of said expression cassette.

10. A DNA fragment according to Claim 9
wherein said multiple copies of said expression cassette
are oriented in head-to-tail orientation.



11. A DNA fragment according to Claim 1, which
is derived from a BglII digest of the Pichia expression
vector pAO817.

12. A DNA fragment according to Claim 7, which
is the Pichia expression vector pEGF819.

13. A DNA fragment according to Claim 7, which
is derived from a BglII-BamHI digest of the Pichia
expression vector pAO816.

14. A methylotrophic yeast cell transformed
with the DNA fragment of Claim 1.

15. A methylotrophic yeast cell according to
Claim 14 wherein said yeast is a strain of Pichia
pastoris.

16. A methylotrophic yeast cell transformed
with the DNA fragment of Claim 4.

17. A methylotrophic yeast cell according to
Claim 16 wherein said yeast is a strain of Pichia
pastoris.

18. A P. pastoris cell transformed with the
DNA fragment of Claim 5.

19. A P. pastoris cell transformed with the
DNA fragment of Claim 6.

20. A P. pastoris cell transformed with the
DNA fragment of Claim 7.

21. A P. pastoris cell according to Claim 20,
wherein said cell is selected from strain G-EGF817S10,
G-EGF817S7, G-EGF817S9, G+EGF817S1, G+EGF817S4 or
G+EGF817S6.

22. A methylotrophic yeast cell transformed
with the DNA fragment of Claim 8.

23. A methylotrophic yeast cell transformed
with the DNA fragment of Claim 9.

24. A methylotrophic yeast cell transformed
with the DNA fragment of Claim 10.

25. A P. pastoris cell transformed with the
DNA fragment of Claim 11.

26. A P. pastoris cell according to Claim 25
wherein said cell is selected from strain G-EGF817S10,
G-EGF817S7 or G-EGF817S9.

27. A P. pastoris cell transformed with the
DNA fragment of Claim 12.

28. A P. pastoris cell according to Claim 27
wherein said cell is selected from strain G+EGF817S1,
G+EGF817S4 or G+EGF817S6.

29. A culture of viable P. pastoris cells
according to Claim 13.

30. A culture of viable P. pastoris cells
according to Claim 21.

31. A process for producing EGF, said process
comprising growing the cells of Claim 14 under conditions
allowing the expression of said expression cassette(s) in
said cells, and the secretion of said EGF product into
the culture medium.

32. A process according to Claim 31 wherein
said methylotrophic yeast is a strain of Pichia pastoris.

33. A process according to Claim 31 wherein
said cells are grown in a medium containing methanol as a
carbon source.

34. A process according to Claim 31 wherein
said cells have the Mut* phenotype.

35. A process according to Claim 31 wherein
said cells have the Mut* phenotype.